New posts in bigdata

What is the best way to load huge result set in memory?

c# ado.net bigdata datareader

NumPy: 3-byte, 6-byte types (aka uint24, uint48)

python numpy bigdata

NoSQL or RDBMS for audit data

Is there a good way to avoid memory deep copy or to reduce time spent in multiprocessing?

Social-networking: Hadoop, HBase, Spark over MongoDB or Postgres?

What is the difference between broadcast_address and broadcast_rpc_address in cassandra.yaml?

cassandra bigdata

Getting exception: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;) while using data frames

Representation of Large Graph with 100 million nodes in C++

c++ vector graph bigdata

How do consumers in multiple consumer groups work across partitions of the same topic in Kafka?

apache-kafka bigdata

Spark DataFrame limit function takes too much time to show

Removing the first line of a big CSV file?

python csv python-3.x bigdata

Where does Big Data go and how is it stored?

database hadoop bigdata nosql

Apache Spark architecture

apache-spark hdfs bigdata

How to find if a folder exists in hadoop or not?

shell hadoop bigdata

Query Failed Error: Resources exceeded during query execution: The query could not be executed in the allotted memory

Improving the distribution of hash function values

hash bigdata

External shuffle: shuffling large amount of data out of memory

java algorithm bigdata

How to use NOT IN in Hive

hadoop hive bigdata

How can I debug a Pig script?

hadoop apache-pig bigdata

Difference between shuffle() and rebalance() in Apache Flink