NiFi: how is flow data stored, in memory or on disk?

Can someone explain in detail how NiFi processors like GetFile or QueryDatabaseTable store rows when the next processor is not available to receive or process data? Does the data pile up in memory and then get swapped to disk when its size exceeds some threshold? Is there a risk of running out of memory or losing data?

678

asked May 15 '17 05:05

Shengjie

1 Answer

I would recommend reading the Apache NiFi documentation, specifically the "Apache NiFi in Depth" document to understand how data is stored and passed through NiFi:

https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html

The short answer is that data is always written to disk in NiFi's internal repositories. A flow file has attributes that are persisted to the flow file repository and content that is persisted to the content repository. The content is not held in memory unless a processor chooses to read the entire content into memory to perform some processing.
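To make the attributes/content split concrete, here is a minimal toy sketch in Python. It is not NiFi's code: `FlowFileRecord`, `write_flow_file`, and `read_content` are hypothetical names invented for illustration, and a plain directory stands in for the content repository. The point is only that the record passed around holds attributes and a pointer to the bytes on disk, and that the content is streamed when a processor actually needs it.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FlowFileRecord:
    """Toy stand-in for a flow file: attributes plus a pointer to content on disk."""
    attributes: dict
    content_path: Path  # where the bytes live; the bytes themselves are not stored here
    size: int


def write_flow_file(content_repo: Path, name: str, data: bytes) -> FlowFileRecord:
    """Persist the content to the (toy) content repository and return a small record."""
    path = content_repo / name
    path.write_bytes(data)
    return FlowFileRecord({"filename": name}, path, len(data))


def read_content(record: FlowFileRecord, chunk_size: int = 4096):
    """Stream the content in chunks so the whole payload is never in memory at once."""
    with record.content_path.open("rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk
```

A processor that only routes on attributes would never call `read_content` at all, which is why queued flow files cost very little memory regardless of how large their content is.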

When flow files are in a queue, none of the content is held in memory, only flow file objects that know where the content lives on disk. When the queue reaches a certain size, these flow file objects are swapped to disk, which allows a queue to hold millions of flow files without keeping millions of flow file objects in memory.
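The swapping idea can be sketched with a toy queue. This is an illustration under stated assumptions, not NiFi's implementation: `SwappingQueue`, `swap_threshold`, and `batch_size` are hypothetical names, records are plain dicts, and swap files are JSON. In a real install the threshold comes from the `nifi.queue.swap.threshold` property in `nifi.properties`.

```python
import json
from collections import deque
from pathlib import Path


class SwappingQueue:
    """Toy FIFO queue that keeps a bounded number of records in memory."""

    def __init__(self, swap_dir, swap_threshold=4, batch_size=2):
        self.swap_dir = Path(swap_dir)
        self.swap_threshold = swap_threshold  # max records in the active list
        self.batch_size = batch_size          # records written per swap file
        self.active = deque()                 # oldest records, held in memory
        self.swap_files = deque()             # (path, count) of batches on disk, oldest first
        self.pending = []                     # newest records waiting to fill a batch
        self._next_id = 0

    @property
    def in_memory(self):
        return len(self.active) + len(self.pending)

    def __len__(self):
        return self.in_memory + sum(count for _, count in self.swap_files)

    def put(self, record):
        has_overflow = self.swap_files or self.pending
        if not has_overflow and len(self.active) < self.swap_threshold:
            self.active.append(record)
            return
        # Active list is full (or older records are already swapped out):
        # buffer the record and spill a full batch to disk.
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            path = self.swap_dir / f"swap-{self._next_id}.json"
            self._next_id += 1
            path.write_text(json.dumps(self.pending))
            self.swap_files.append((path, len(self.pending)))
            self.pending = []

    def get(self):
        if not self.active:
            if self.swap_files:
                # Swap the oldest batch back in.
                path, _ = self.swap_files.popleft()
                self.active.extend(json.loads(path.read_text()))
                path.unlink()
            elif self.pending:
                self.active.extend(self.pending)
                self.pending = []
        return self.active.popleft()  # raises IndexError when the queue is empty
```

With `swap_threshold=3` and `batch_size=2`, putting ten records leaves at most four of them in memory while `len(queue)` still reports ten, and draining the queue returns them in their original order.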

There is also the concept of back-pressure, which caps a queue by the number of flow files or by the total size of all flow files in it.
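A rough sketch of how back-pressure behaves, again as a toy model rather than NiFi's code. The `Connection` class and its method names are hypothetical. The defaults mirror what I understand a new connection to use (10,000 flow files or 1 GB), but check your own connection settings. Note that the limit is soft: once either threshold is reached, the upstream processor simply stops being scheduled until the queue drains.

```python
from collections import deque


class Connection:
    """Toy model of a queue with count-based and size-based back-pressure."""

    def __init__(self, max_count=10_000, max_bytes=1024 ** 3):
        self.max_count = max_count  # object threshold (assumed default)
        self.max_bytes = max_bytes  # data size threshold (assumed default)
        self._sizes = deque()       # size in bytes of each queued flow file
        self.queued_bytes = 0

    def upstream_may_run(self):
        """The source processor is scheduled only while both thresholds are unmet."""
        return len(self._sizes) < self.max_count and self.queued_bytes < self.max_bytes

    def enqueue(self, size):
        # Soft limit: a flow file already produced is still accepted.
        self._sizes.append(size)
        self.queued_bytes += size

    def dequeue(self):
        size = self._sizes.popleft()
        self.queued_bytes -= size
        return size
```

So in the scenario from the question, GetFile would stop picking up new files once the downstream queue is full, instead of filling memory. The files stay in the source directory until the next processor catches up.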

151

answered Sep 30 '22 17:09

Bryan Bende


Tags:

storage

persistence

apache-nifi
