New posts in bigdata

Apache Spark: impact of repartitioning, sorting and caching on a join

Processing a very large text file with lazy Texts and ByteStrings

Send KafkaProducer from local machine to hortonworks sandbox on virtualbox

Implementing custom Spark RDD in Java

apache-spark bigdata

Spark Scala Understanding reduceByKey(_ + _)

How to process a range of hbase rows using spark?

Pyspark: how to duplicate a row n time in dataframe?

python pyspark bigdata

In spark join, does table order matter like in pig?

Creating a comparable and flexible fingerprint of an object

Number of reducers in hadoop

Is Spark's KMeans unable to handle bigdata?

Moving from Relational Database to Big Data

What format do sites like Facebook use to store data for personal profiles?

Where is Apache Kafka placed in the PACELC-Theorem

HBase FuzzyRowFilter: how does jumping of keys work?

hbase bigdata hfile

What are the limitations of implementing MySQL NDB Cluster?

SolrException Plugin init failure for [schema.xml] fieldType "pint": Error loading class 'solr.IntField'

sorting large text data

python sorting bigdata

Can Mongo config servers have different user privileges in each of them?

mongodb bigdata