How to Ignore Timestamps in S3 Collector?

padilla Posts: 7 mod

For the S3 collector, suppose I have events with a _time field of 3/1/2022 in a file created today, so the file sits in a path like /dir1/<today’s_date>/dir2/. Will the earliest/latest values filter those events out if I select -24h? I do not want them filtered out; I want to be able to collect older events that are written to newly created paths.

Answers

  • Megan Davis Posts: 10

    Add a filter so it applies only to the directories of interest, e.g. source.startsWith('dir1') && source.includes('dir2').
    Change the timestamp settings in the Event Breaker so that the timestamp is NOT extracted,
    and leave the timestamp processing to a pipeline instead.
    To force the Event Breaker to skip time extraction, scan with a depth of 2.
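
As a rough illustration of how the filter expression above behaves, here is a minimal JavaScript sketch. The `matchesDirs` wrapper and the sample `source` values are hypothetical and exist only for this example; in the product the expression would be entered on its own, and the real shape of `source` (leading slash, bucket prefix, date format) may differ.

```javascript
// Hypothetical wrapper around the filter expression, for illustration only.
function matchesDirs(source) {
  return source.startsWith('dir1') && source.includes('dir2');
}

// Hypothetical source values; real values depend on how the collector
// populates the source field.
const samples = [
  'dir1/2022-03-05/dir2/events.json',   // starts with dir1 and contains dir2
  'dir1/2022-03-05/other/events.json',  // no dir2 segment
  '/dir1/2022-03-05/dir2/events.json',  // leading slash, so startsWith('dir1') fails
];

const results = samples.map(matchesDirs);
console.log(results);
```

The third sample shows why it is worth checking the actual `source` value first: if it begins with a slash or a bucket name, `startsWith('dir1')` will not match and `includes` may be the safer test.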

  • Jon Rust Posts: 431 mod

    I'd recommend changing the way the files are written so that the file path uses the event date and time, not the current date and time. Unless the event time is in the path, there is no way to use the timestamp constraints in the collector. You'd instead have to rely on pattern matching inside the event, which is much more costly in terms of performance.
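
A minimal sketch of that idea, for whatever process writes the files: derive the date segment of the path from the event's own time rather than from the clock. The `pathForEvent` helper, the `YYYY-MM-DD` date format, and the assumption that `_time` holds epoch seconds are all illustrative and not taken from the thread.

```javascript
// Hypothetical helper: build the output directory from the event's own time.
// Assumes event._time is epoch seconds and that the path uses YYYY-MM-DD (UTC).
function pathForEvent(event) {
  const d = new Date(event._time * 1000);    // Date expects milliseconds
  const day = d.toISOString().slice(0, 10);  // e.g. "2022-03-01" in UTC
  return `/dir1/${day}/dir2/`;
}

// An event from 3/1/2022 (00:00:00 UTC) lands under that date,
// no matter when the file is actually written.
const example = pathForEvent({ _time: 1646092800 });
console.log(example); // prints "/dir1/2022-03-01/dir2/"
```

With paths laid out this way, a time range on the collector can be matched against the path itself instead of against the contents of each event.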