With the release of Scala 2.11, Scala became a fully JSR-223 compliant scripting language for Java. JSR-223 is the Java Specification Request that defines how scripting languages interface with Java and how Java applications can use a scripting language internally.
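As a minimal sketch of what JSR-223 access looks like from code, the snippet below asks the standard `javax.script.ScriptEngineManager` for an engine registered under the name "scala". The object name `ScriptEngineSketch` and the helper `lookup` are hypothetical, and whether an engine is actually found (and whether evaluation succeeds) depends on the Scala compiler being on the classpath and on how it is configured, so both cases are handled.

```scala
import javax.script.{ScriptEngine, ScriptEngineManager}
import scala.util.Try

object ScriptEngineSketch {
  // getEngineByName returns null when no engine is registered under that name,
  // so the result is wrapped in an Option.
  def lookup(name: String): Option[ScriptEngine] =
    Option(new ScriptEngineManager().getEngineByName(name))

  def main(args: Array[String]): Unit =
    lookup("scala") match {
      case Some(engine) =>
        // Evaluation may fail if the engine's classpath is not set up,
        // so the call is wrapped in Try instead of assuming success.
        println(Try(engine.eval("1 + 2")))
      case None =>
        println("No Scala script engine found on the classpath")
    }
}
```

The same `lookup` call works for any other JSR-223 engine name that happens to be registered in the running JVM.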
In a recent post I demonstrated how easy it is to connect to a REST API such as Fitbit's with Scala to collect JSON data. Building on the results of that post, I would like to demonstrate how Apache Zeppelin can be used not only to fetch the data but, in the end, to visualize it. Once the data is collected, Zeppelin makes it easy to visualize the output through different graphs.
Apache Zeppelin is a notebook-like, web-based data analytics tool focused on exploratory data analysis in modern big data architectures. It supports multiple interpreters such as Tajo, Spark, Hive, HBase and more. That said, in the case described here only Scala is used to display the received data, although the use case could easily be extended to include Apache Hive or Spark. Continue reading “Fitbit Visualization with Apache Zeppelin”
The Fitbit developer API provides access to the data collected by its personal trackers for use in custom application development. Besides read access, write access is also available, not only to one's own data but also on behalf of other platform users via OAuth authentication. Comprehensive documentation of the Fitbit API can be found here: https://dev.fitbit.com/docs/ . Continue reading “Access to Fitbit API with Scala”
Data visualization is an integral part of data science. The programming language Scala has many characteristics that make it popular for data science use cases alongside languages like R and Python. Immutable data structures and functional constructs are some of the features that make it so attractive to data scientists. Popular big data crunching frameworks like Spark or Flink have their fair share in an ever-growing ecosystem of tools and libraries for data analysis and engineering. Scala is particularly well suited to building robust libraries for scalable data analytics.
In this post we are going to introduce Breeze, a library for fast linear algebraic manipulation of data sets, together with tools for visualization and NLP. Starting with the basic creation of vectors, we will build an application for plotting stock prices. The stock data is obtained from Yahoo Finance, but can also be downloaded here for SAP, YAHOO, BMW, and IBM. Continue reading “Plotting Graphs – Data Science with Scala”
Web scraping is the process of extracting entities from web pages. These entities can be news articles, blog posts, products, or any other information displayed on the web. Web pages consist of HTML (an XML-like structure) and therefore present information in a structured form, the DOM (Document Object Model), from which it can be extracted. The structure is usually not very strict, rarely consistent with that of other web pages, and subject to change as the web page evolves. Using scripting languages rather than compiled programming languages therefore has a lot of advantages. On the other hand, compiled languages often have a speed advantage as well as a broad foundation of fundamental libraries. The DOM reference implementation of the W3C, for example, is written in Java.
Java’s approach of opening up the JVM (Java Virtual Machine) to other (dynamically typed) languages is a great way to combine the benefits of both a compiled and a scripting language. With JDK 8, compiling dynamic languages to the JVM has become simpler, with potentially improved implementations of compilers and runtime systems through the invokedynamic instruction. I found that this presentation currently explains best what invokedynamic is and how it works in general.
Continue reading “Web Scraping with JDK 8 ScriptEngine (Nashorn) and Scala”
Traits are Scala’s way of implementing mixins, that is, combining methods from several sources in one class. They are a powerful tool and an extension to the Java object model. You can think of traits as interfaces with concrete implementations.
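To make the mixin idea concrete, here is a small sketch. The names `Greeter`, `Shouter`, `Person`, and `TraitsSketch` are made up for illustration and do not come from the post itself. One trait mixes an abstract member with a concrete method, and a class combines two traits while supplying only the missing piece.

```scala
// A trait can declare abstract members and supply concrete ones.
trait Greeter {
  def name: String                      // abstract: must be provided by the class
  def greet: String = s"Hello, $name"   // concrete: inherited as-is
}

// A second, independent trait with only a concrete method.
trait Shouter {
  def shout(text: String): String = text.toUpperCase + "!"
}

// Person mixes in both traits and only has to supply `name`.
class Person(val name: String) extends Greeter with Shouter

object TraitsSketch {
  def main(args: Array[String]): Unit = {
    val p = new Person("Ada")
    println(p.greet)           // Hello, Ada
    println(p.shout(p.greet))  // HELLO, ADA!
  }
}
```

Unlike a plain Java interface of that era, neither `greet` nor `shout` has to be reimplemented in `Person`.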
The Actor model is a robust concurrency abstraction that was introduced in 1973 by Carl Hewitt, Peter Bishop, and Richard Steiger. A number of programming languages have adopted this model since then, most notably Erlang. Since Erlang is widely used in the telecommunications industry and by services such as Facebook Chat, the Actor model has already become the foundation of some of our day-to-day communication.
In this post I would like to give a brief overview of these two concepts as applied in Scala, along with some examples.
Continue reading “Scala: Traits & Actors”