
Use Cases of NiFi

Tags:

apache-nifi

I have a question about NiFi's capabilities and the appropriate use cases for it.

I've read that NiFi aims to provide a space for flow-based processing. After playing around with NiFi a bit, I've also come to appreciate its capability to model/shape the data in a way that is useful to me. Is it fair to say that NiFi can also be used for data modeling?

Thanks!

BigBug asked Jun 14 '16 17:06



1 Answer

Data modeling is a bit of an overloaded term, but in the context of your desire to model/shape the data in a way that is useful for you, it sounds like it could be a viable approach. The rest of this is under that assumption.

NiFi builds its dataflows on principles and a design closely related to flow-based programming (FBP), but that is the means. The function is getting data from point A to point B (and possibly back again). Of course, systems don't inherently speak the same protocols, formats, or schemas, so something has to shape the data the producer supplies into what the consumer expects. This gets into common enterprise integration patterns (EIP) [1] such as mediation and routing. In a broader sense, though, it is simply about getting the data to those that need it (systems, users, etc.) when and how they need it.
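To make the mediation and routing idea a little more concrete, here is a minimal sketch in plain Python. It is not NiFi's API, and the record fields, the `shape` and `route` helpers, and the destination names are all hypothetical; it only illustrates reshaping what a producer sends into what a consumer expects and then choosing where the result goes.

```python
import json

# Hypothetical record as a producer might emit it (field names are made up).
producer_record = '{"ts": "2016-06-14T17:06:00Z", "temp_f": 98.6, "site": "lab-1"}'


def shape(raw: str) -> dict:
    """Mediation: translate the producer's format into the consumer's schema."""
    src = json.loads(raw)
    return {
        "timestamp": src["ts"],
        # Convert Fahrenheit to Celsius because the (assumed) consumer wants Celsius.
        "temperature_c": round((src["temp_f"] - 32) * 5 / 9, 2),
        "location": src["site"],
    }


def route(record: dict) -> str:
    """Routing: pick a destination name based on the record's content."""
    return "alerts" if record["temperature_c"] > 38.0 else "archive"


shaped = shape(producer_record)
destination = route(shaped)
print(destination, shaped)
```

In NiFi itself this kind of work is typically done by configuring and connecting processors in a flow rather than by writing code like the above; the sketch is only meant to show the pattern.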

Joe Witt, one of the creators of NiFi, gave a great Meetup talk that may be in line with this idea of data shaping in the context of data science. The slides are available at [2].

If you have any additional questions, I would point you to the community mailing lists [3], where you can dig in more and get a broader perspective.

  • [1] https://en.wikipedia.org/wiki/Enterprise_Integration_Patterns
  • [2] http://files.meetup.com/6195792/ApacheNiFi-MD_DataScience_MeetupApr2016.pdf
  • [3] http://nifi.apache.org/mailing_lists.html
apiri answered Sep 24 '22 07:09