Tag Archives: data science

Highlights from Spark + AI Summit 2018 (SAIS 2018)

Are you into cluster computing with Apache Spark? This year’s SAIS 2018 conference covered data engineering and data science best practices for productionizing AI. In a nutshell: keep your training data fresh with stream processing, monitor quality, and test and serve models (at massive scale, since we are talking about Spark). The conference also offered deep-dive sessions on Spark integration with popular machine learning frameworks such as TensorFlow, scikit-learn, Keras, PyTorch, Deeplearning4j, BigDL and Deep Learning Pipelines.

Here is a list of several interesting topics (in case you couldn’t join ;-):

Spark Experience and Use Cases

CERN’s Next Generation Data Analysis Platform with Apache Spark

A great talk about using Spark for HEP (high energy physics) data processing and analysis, as a complementary tool to the current grid computing at CERN.

Wanted: Machine learning expert for manufacturing projects

This year we started working on advanced analytics projects in manufacturing. The boom in IoT sensors, the never-ending pressure to increase yields and output quality, the diminishing marginal effect of lean and Six Sigma activities, and the big trend toward analytics meant that we quickly ran out of capacity. The projects are intriguing, the data are large, we are fun to work with and the demand is enormous. Honestly, I don’t see any reason not to join us!


Infrastructure and Development for Data Science

Coming from a classical IT background in software development, it took us a while to arrive at an architecture capable of fulfilling our needs for Data Science projects. Be aware that treating software development and Data Science in the same manner is not a good idea, as you might seriously lower the productivity of your Data Science team.
