henning.kropponline.de

Big Data by Passion

Tag: inputformat

Custom MATLAB InputFormat for Apache Spark

Hadoop supports multiple file formats as input for MapReduce workflows, and the same InputFormats can be used by programs executed with Apache Spark. Defining custom InputFormats is common practice among Hadoop data engineers and is discussed here using a publicly available data set.

The approach demonstrated in this post does not provide a general-purpose MATLAB™ InputFormat for Hadoop. That would require significant effort to find a general mapping from MATLAB™’s file format and type system to those of HDFS.
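To make the idea of a custom InputFormat a little more concrete, here is a minimal plain-Python sketch of the two responsibilities Hadoop's `InputFormat` contract defines: computing input splits and reading records from a split. It does not use Hadoop, Spark, or MATLAB files. The class names `FileSplit` and `FixedWidthInputFormat`, the fixed-size record layout, and all parameters are assumptions made purely for illustration and are not the format discussed in the post.

```python
import os
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class FileSplit:
    """A byte range of one file (hypothetical, loosely modeled on a file split)."""
    path: str
    start: int
    length: int


class FixedWidthInputFormat:
    """Hypothetical format: a binary file made of fixed-size records."""

    def __init__(self, record_size: int, records_per_split: int):
        self.record_size = record_size
        self.records_per_split = records_per_split

    def get_splits(self, path: str) -> List[FileSplit]:
        # Align splits to record boundaries so no record straddles two splits.
        size = os.path.getsize(path)
        step = self.record_size * self.records_per_split
        return [FileSplit(path, start, min(step, size - start))
                for start in range(0, size, step)]

    def read_records(self, split: FileSplit) -> Iterator[Tuple[int, bytes]]:
        # Yield (byte offset, record bytes) pairs for one split.
        # A trailing partial record at the end of the file is skipped.
        end = split.start + split.length
        with open(split.path, "rb") as f:
            f.seek(split.start)
            offset = split.start
            while offset + self.record_size <= end:
                yield offset, f.read(self.record_size)
                offset += self.record_size
```

Each split can be read independently of the others, which is what allows a framework to hand different splits to different tasks. A real implementation would extend Hadoop's `InputFormat` and `RecordReader` classes on the JVM rather than use the stand-ins above.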

Posted by hkropp in Hadoop, Spark, Uncategorized on October 23, 2016 (5 minute read)
