henning.kropponline.de

Big Data by Passion

Tag: writable

Custom MATLAB InputFormat for Apache Spark

Hadoop supports multiple file formats as input for MapReduce workflows, including programs executed with Apache Spark. Defining custom InputFormats is a common practice among Hadoop data engineers and is discussed here using a publicly available data set.

The approach demonstrated in this post does not provide a general-purpose MATLAB™ InputFormat for Hadoop. That would require significant effort to find a general mapping from MATLAB™’s file format and type system to those of HDFS.

Posted by hkropp in Hadoop, Spark, Uncategorized on October 23, 2016. Reading time: 5 minutes.
