Generating geometrically spaced trajectory data with geo-rev11.py and geometricdump.com,
and analyzing it with 2_time_geom.py and s4_tb_fft_ww_cycframes.py.
← / → or space · d toggles dark mode · click the slide counter to jump
2_time_std.py and the linear S4 scripts work on linearly spaced dumps. That is fine until the
dynamics spread over many decades: a correlation function can only be evaluated at time separations that exist
in the saved trajectory, and a single linear spacing cannot cover all the decades at once.
Numbers for the production region used throughout these slides (106 steps ≈ 100 τα, 100k particles, ~1.2 MB per frame):
| Dump schedule | Frames / run | Size / run | Lag coverage |
|---|---|---|---|
| linear, every step | 106 | ~1.2 TB | everything |
| linear, every 1000 steps | 1,000 | 1.2 GB | nothing below 1 LJ; the first 3 decades are missing |
| geometric (25 dumps/cycle) | 2,775 | 3.3 GB | 1 step to full span, uniform in log t |
A single geometric series from step 1 to the end of the run would solve storage. It fails for a different reason. Correlation functions are averaged over initial times t0: the same separation t is measured from many starting frames. The translational overlap, for example:
The schedule is therefore built in two levels:
Cost: separations between the cycle length Λ and its multiples are coarser. In practice the set of available separations stays dense enough (slide 13 shows it exactly).
Generated live with the geo-rev11.py --small --old algorithm. Each tick is one frame.
The time axis is fixed at 160 LJ, so changing -t or the cycle count changes the coverage you see.
Top: the dumps in real (linear) time. Bottom: one row per cycle, each showing the time of its dumps measured from that cycle's own start, on a log axis. The rows are identical: every cycle repeats the same pattern.
Prints three text files that a LAMMPS loop reads line by line. One frame per DCD file.
| File | Contents |
|---|---|
run_times.txt | steps to run before each dump (argument to run) |
dump_times.txt | cumulative timestep of each dump (dump trigger / bookkeeping) |
file_names.txt | output filename for each frame |
| Option | Meaning |
|---|---|
-a [float] | (LJ time units) first geometric term. Without --old, this is the first runtime; with --old, this is the first dump offset. Values below one timestep floor to one timestep. |
-t [float] | (LJ time units) endpoint used with -a and -n to set the geometric ratio. With --old, it is the sequence span; without --old, it is the largest single runtime. |
-n [int] | (count) target number of geometric terms. Terms whose runtime floors to zero timesteps are dropped, so the true dump count can be lower than this target. |
-d [float] | (LJ time units) simulation timestep, used to convert LJ times to integer timesteps. |
-r [int] | (count) number of times to repeat the generated sequence back to back. |
-c [int] | (file index) initial filename counter before each generated file increments it. Set this to the last already-used index; the first generated file is -c + 1. |
-m [int] | (timesteps) offset added to every cumulative dump time, used to continue the timestep counter after a previous sequence. |
--small | (mode flag) generate the simple geometric-series schedule used by this workflow. Without it, geo-rev11.py uses the older long-series mode. |
--old | (mode flag) treat the geometric terms as cumulative dump offsets and compute runtimes as their differences. This makes every repeat have identical dump offsets. |
--test | (diagnostic flag) print diagnostics instead of loop-file output. Do not use this output as LAMMPS schedule files. |
loopvar sets the number of loop iterations, which should equal the number of lines in the generated files.variable ... file reads one line at a time from run_times.txt, file_names.txt, and dump_times.txt.dump_modify unwrap yes writes unwrapped coordinates, which the downstream analysis expects.run ${eruntime} advances exactly to that dump time, so the file receives one frame, then undump closes it before the next iteration.next advances the loop counter and each file-backed variable to the next schedule line.jump SELF loopstart repeats until loopvar is exhausted, producing one DCD file per generated schedule entry.runtimek = ⌊t0 rk⌋; dump time = running sum.
-t sets the largest single gap. The sequence ends wherever the sum lands.
Used for the long sequence.
dump offsetk = ⌊t0 rk⌋ exactly; runtime = difference from the previous dump.
The sequence ends exactly at -t, and every repeat reproduces identical offsets.
Used for the short sequences. The analysis scripts verify that all cycles are identical, and this mode guarantees it.
Top: -a 0.001 -t 9 -n 27, one sequence. Bottom: the same flags with --old and -r 3, on the same linear axis.
Both production commands use --small; the non---small path is the legacy long-series mode and predates the current workflow.
System is equilibrated before recording. Every cycle start is an equivalent initial time.
Analyze with 2_time_geom.py and s4_tb_fft_ww_cycframes.py.
During aging, correlations depend on the waiting time tw, and the evolution is slow in log tw,
so one geometric series samples it with a few hundred frames. Analyze the long region with 2_time_aging.py
(explicit initial frames); analyze the short region with the cyclic scripts, starting at the right file index (-z).
You can hand-pick geo-rev11.py flags, but they interact: the ratio depends on -a, -t, -n,
and the stitch values -c and -m depend on the long sequence's output.
The site takes system-level inputs and derives the flags.
-t = τα/2, -n searched so the last dump lands on τequil-n 27, -r = total/length-c, -m from the long output (+ the step-0 file)The reason for the specific defaults is dump density at τα: next slide.
A schedule can look reasonable and still measure τα badly. The available separations get sparse toward the end of a cycle and dense again right after the cycle boundary. The cycle length decides which region τα falls in.
At the default Λ = 0.9 τα, τα sits just past the cycle boundary, in dense spacing. Push Λ above τα and τα lands in the sparse tail of the cycle, where the nearest available separations are far apart.
dt = 0.001, τα = 10 LJ, τequil = 200 LJ, production 100 τα.
Long sequence: one sequence of geometrically growing runtimes,
largest runtime τα/2 = 5 LJ, with 338 terms (the website searches -n for
the value whose summed runtimes land closest to τequil; here the sum is 200,148 steps ≈ 200 LJ),
first runtime as small as possible, filename counter starting at 1 (the step-0 dump took short1.dcd),
timestep counter starting at 0:
Short sequences: 111 repeats of a sequence whose dump times are geometric (--old), each spanning 9 LJ = 0.9 τα with a goal of 27 dumps and the first dump 1 step in, continuing the filename counter from the long sequence and the timestep counter from its last dump:
| Long | Short (production) | |
|---|---|---|
| dumps | 338 | 111 cycles × 25 = 2,775 |
| files | short2.dcd – short339.dcd | short340.dcd – short3114.dcd |
| timesteps | 1 – 200,148 (≈ 200 LJ) | 200,149 – 1,199,148 (≈ 1,000 LJ) |
One short sequence (production cycle), in steps. Dump times, measured from the start of the sequence (the last is 9 LJ / 0.001 = 9,000):
and the corresponding runtimes (the gaps between dumps):
The analysis scripts don't read dump_times.txt directly. The schedule is recovered from the DCD metadata.
2_time_geom.py and s4_tb_fft_ww_cycframes.py both start with these steps
(L = frames per cycle, the -c / -G argument):
Step 2 is why --small --old matters at generation time, and why a wrong -c, -G,
or -z fails immediately instead of producing wrong numbers.
-g N asks for N lag values, geometrically spaced between a first lag and the longest separation in the data.
The data only contains specific separations, so each requested value is moved to the closest one available.
When two requests land on the same separation it is kept once, which is why fewer values come out than went in.
-U sets the first requested lag. -U n (positive) starts n frames in.
-U 0, the old default, starts at 1.0 LJ, so no separations shorter than 1 LJ get requested. Try it below.
Same observables, same output format. The difference is the dump schedule each one expects.
traj<n>.dcd, count -n-d frames-a), Fs (-q), orientational correlations (--polyatomic*)-r runs)-s/-m, last initial time -k-f / -g / -w, --low/high-intervalshort<n>.dcd, count -t, first file -z-c, the true frames per cycle-UFor the worked example: average over 5 runs, reading 2,775 short files per run starting at short340.dcd, with 25 frames per cycle, requesting 50 geometric lags with the first lag one frame in:
* file tests: before computing, the script checks that every expected DCD file exists and opens readably
(--skip-file-tests skips this part; the cycle verification always runs). Needs at least 2 full cycles of frames.
--negvals is forced off in the geom version.
S4(q, tb): spatial correlations of mobility. For each interval length tb, particles get a mobility weight w(t0, t0+tb), the weights are binned on an FFT grid, and the structure factor of that field is computed. Output per tb: total, self, and distinct parts, with errors from across-run scatter. At least 2 runs required.
--W1-theta 0.25 (step function on displacement, same form as the overlap on slide 3) is the usual choice.
Also available: --W1-gauss, --W1-exp, --W1-dir-theta (orientational, needs --polyatomic).-a) and tc (-c) offset the two intervals or make them unequal.
We usually leave both at 0, which compares an interval with itself. Nonzero values require explicit --W2-* options.-q discrete region, -v shell region, -l shell count).-x FFT grid size, -y box size.Same S4 computation. cycframes adds the cycle steps from slide 12, plus one extra time check.
traj<n>.dcd via -n-d initial-time spacing in frames (default 10)-M margin in multiples of -d-x FFT grid, -y box size (both required)-a), tc (-c), usually 0; W1/W2 flags-q / -v / -l-k last initial time; -i one output file per tb; -r ≥ 2-N (default short), count -n, first file -z-G, the true frames per cycle-U, --custom-geom-base-d in cycles (default 1); -M in cycles-S; mismatches are skippedFor the worked example: average over 5 runs, reading 2,775 files named short*.dcd starting at index 340, with 25 frames per cycle, 30 requested tb values, a 24³ FFT grid for a box of side 47.0, and a Heaviside mobility function (translational overlap) with threshold radius 0.25:
| Issue | Behavior / fix |
|---|---|
| re-running geo-rev11 without deleting outputs | schedules concatenate silently (append mode) |
| set_len is the true dumps per cycle, not the goal | goal -n 27 produces 25 surviving dumps, so -c 25 / -G 25. The wrong value fails the cycle check |
| forgetting the step-0 dump | it consumes the first filename; first generated file is index 2, and the first production file in the example is short340.dcd (1 + 338 + 1) |
wrong -z in long+short data | long-sequence frames enter the cycle check and it fails |
--skip-file-tests | skips only the existence/readability tests; the cycle check always runs |
-U < 0 | unsupported; the script warns that this path is incorrect, so leave -U nonnegative |
| S4 with 1 run | hard error; errors come from across-run scatter |
ta or tc ≠ 0 without --W2-* | hard error |
| too few frames | 2_time_geom needs at least 2 cycles; cycframes warns if -M leaves too little room and may fail at the last tb |
The worked example end to end: dt = 0.001, τα = 10 LJ, τequil = 200 LJ, production 100 τα, 5 runs.
-c / -G take the true dumps per sequence (25), not the -n goal (27).
-z 340 = 1 step-0 file + 338 long files + 1.
geo-rev11.py | generates the schedule files (run_times, dump_times, file_names) for the LAMMPS loop |
| geometricdump.com | derives the geo-rev11 flags from dt, τα, τequil; emits both commands and the files |
2_time_geom.py | MSD, overlap, Fs, orientational correlations on cyclic dumps |
s4_tb_fft_ww_cycframes.py | S4(q, tb) on cyclic dumps |
2_time_aging.py | two-time correlations vs (tw, tw+t) for the long-sequence region |
Repo docs: files_readme/, geometric-dumping/README.md.