
We want to enable persistent queuing and would like to know how large we can safely make the max file size.


We are enabling persistent queuing (PQ) on our destinations and noticed that the Max File Size defaults to 1 MB. With the amount of data we are processing, that will lead to a lot of small files, so we were thinking of increasing the Max File Size to 100 MB. However, before we do that, we wanted to know whether there is any potential impact to Cribl if the PQ files are too large.
When Cribl reads these files to send to the destination, does it read the entire file into memory?
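For reference, the destination PQ settings in question look roughly like this in outputs.yml. This is a sketch: the destination id is hypothetical, and field names such as pqMaxFileSize may differ by Cribl version, so check your own config before copying.

```yaml
outputs:
  my-dest:                    # hypothetical destination id
    type: splunk_lb
    pqEnabled: true
    pqMaxFileSize: 1 MB       # the default we are considering raising to 100 MB
    pqMaxSize: 5 GB           # total disk the queue may consume
    pqPath: $CRIBL_HOME/state/queues
    pqOnBackpressure: block   # what to do when the queue itself fills
```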

Best Answer

  • Brandon McCombs (Posts: 150, mod)
    Answer ✓

    I recommend either sticking with 1 MB or compromising at, say, 10 MB, but the smaller the better.

    The reason is that we can't delete a file until all of its data has been flushed, so the faster we can flush all of a file's data, the faster we can free up the storage it consumes. The app won't flush faster because the file is smaller, but it can remove a smaller file sooner because there is less data to drain.

    Also note that each worker process writes to its own files, so any given file is written to by only one process.

    Offhand I'm unsure whether each PQ file is loaded into memory when its data is sent to the destination. However, even if it is, keep in mind that Stream reads the files sequentially, so it will only read from one file at a time for any given destination and worker process combination.
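    The disk-space argument above can be sketched with a toy simulation. This is not Cribl source code, just a model of the stated rule: a file's space is reclaimed only once every byte in it has been flushed, so at a fixed flush rate, larger files hold their disk space longer.

    ```python
    def disk_usage_over_time(file_size_mb: int, total_mb: int, flush_mb_per_tick: int):
        """Simulate draining a queue of `total_mb` split into files of `file_size_mb`.

        Returns disk usage (MB of not-yet-deleted files) after each tick.
        A file is deleted only when all of its data has been flushed.
        """
        flushed = 0
        usage = []
        while flushed < total_mb:
            flushed = min(total_mb, flushed + flush_mb_per_tick)
            fully_flushed_files = flushed // file_size_mb
            usage.append(total_mb - fully_flushed_files * file_size_mb)
        return usage

    # Drain a 100 MB backlog at 10 MB per tick:
    print(disk_usage_over_time(file_size_mb=1, total_mb=100, flush_mb_per_tick=10))
    # 1 MB files: disk frees steadily -> [90, 80, 70, ..., 10, 0]
    print(disk_usage_over_time(file_size_mb=100, total_mb=100, flush_mb_per_tick=10))
    # one 100 MB file: disk stays at 100 until fully drained -> [100, ..., 100, 0]
    ```

    With 1 MB files the queue releases storage every tick; with a single 100 MB file, all 100 MB stays on disk until the last record is flushed.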
