How to pull data from Mainframe to Hadoop

1 Answer

COBOL is a programming language, not a file format. If what you need is to export files produced by COBOL programs, you can use the same technique as if those files were produced by C, C++, Java, Perl, PL/I, Rexx, etc.

In general, you will have three different data sources: flat files, VSAM files, and a DBMS such as DB2 or IMS.

DBMSs have export utilities to copy the data into flat files. Keep in mind that data in DB2 will likely be normalized, so you will probably need the contents of related tables to make sense of it.

VSAM files can be exported to flat files via the IDCAMS utility.

I would strongly suggest you convert the files to a text format before transferring them to another box with a different code page. Dealing with mixed records after the transfer is harder than doing the conversion up front: the text fields must have their code page translated, while the binary fields must not be translated but likely must be converted from big endian to little endian.
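To illustrate why mixed records are awkward once they leave the mainframe, here is a minimal Python sketch of the field-by-field decoding you would need on the receiving side. The record layout (a 10-byte text field, a 4-byte binary integer, and a 4-byte packed-decimal amount with two decimal places), the function names, and the choice of code page 037 are all assumptions made for the example; your actual layout and code page will differ.

```python
import struct
from decimal import Decimal

# Assumption: code page 037 (US/Canada EBCDIC); the real code page is site-specific.
EBCDIC_CODEC = "cp037"


def unpack_comp3(raw: bytes, scale: int) -> Decimal:
    """Decode a packed-decimal (COMP-3) field: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):  # sign nibbles 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


def decode_record(raw: bytes):
    """Decode the hypothetical 18-byte record one field at a time."""
    name = raw[0:10].decode(EBCDIC_CODEC).rstrip()  # text: translate the code page
    (qty,) = struct.unpack(">i", raw[10:14])         # binary: big endian, no translation
    amount = unpack_comp3(raw[14:18], scale=2)       # packed decimal: no translation
    return name, qty, amount


# Build one sample record the way it might look in a mainframe file.
sample = (
    "WIDGET".ljust(10).encode(EBCDIC_CODEC)
    + struct.pack(">i", 1234)
    + bytes([0x12, 0x34, 0x56, 0x7C])  # 12345.67 with positive sign nibble 0xC
)

print(decode_record(sample))  # ('WIDGET', 1234, Decimal('12345.67'))
```

Every field needs its own rule and its own offsets, and a transfer that translates the whole record as text would corrupt the last two fields. Converting everything to text on the mainframe removes this step entirely.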

The conversion can likely be done via the SORT utility on the mainframe. Mainframe SORT utilities tend to have extensive data manipulation functions. There are other mechanisms you could use (other utilities, custom code written in the language of your choice, purchased packages) but this is what we tend to do in these circumstances.

Once you have your flat files converted such that all data is text, you can transfer them to your Hadoop boxes via FTP, SFTP, or FTPS.
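As a rough sketch of the transfer step, the Python function below pulls one all-text dataset in text mode using the standard `ftplib` module, run from the receiving box. The function name, host, credentials, and dataset name are hypothetical. It assumes the mainframe's FTP server translates the code page during a text-mode transfer, which is the usual behavior but worth confirming at your site. SFTP is not covered by `ftplib` and would need a different client.

```python
from ftplib import FTP_TLS  # FTPS client; plain ftplib.FTP exposes the same retrlines call


def fetch_text_dataset(ftp, dataset: str, local_path: str) -> None:
    """Retrieve one dataset in text mode and write it out one record per line."""
    with open(local_path, "w", encoding="utf-8", newline="\n") as out:
        # retrlines hands each line to the callback with the line ending stripped.
        # The single quotes mark the dataset name as fully qualified.
        ftp.retrlines(f"RETR '{dataset}'", lambda line: out.write(line + "\n"))


# Hypothetical usage (host, user, password, and dataset name are placeholders):
#
#   ftp = FTP_TLS("mainframe.example.com")
#   ftp.login("user", "password")
#   ftp.prot_p()  # protect the data connection as well as the control connection
#   fetch_text_dataset(ftp, "PROD.SALES.TEXT", "/tmp/sales.txt")
#   ftp.quit()
```

From there the local file can be loaded into HDFS with whatever tooling you already use on the Hadoop side.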

This isn't an exhaustive coverage of the topic, but it will get you started.

answered Oct 19 '22 03:10

cschneid

Tags:

hadoop

mainframe

azzaxp
