39K subscribers
698 videos
Engineering Fast Indexes for Big Data Applications: Spark Summit East talk by Daniel Lemire
First Steps With Spark - Spark Screencast #1
Tagging and Processing Data in Real... Hari Shreedharan (Cloudera) Siddhartha Jain (Salesforce Inc)
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
FIS: Accelerating Digital Intelligence in FinTech: Spark Summit East talk by Aaron Colcord
Building Real Time BI Systems with Kafka, Spark & Kudu: Spark Summit East talk by Ruhollah Farchtchi
Spark Streaming: Pushing the Throughput Limits, the Reactive Way
Fusing Apache Spark and Lucene for Near Realtime Predictive Model Building (Deb Das)
Using Spark and Elasticsearch for Real-time Data Analysis - Costin Leau (Elasticsearch)
Optimizing Spark Deployments for Containers: Isolation, Safety & Performance by William Benton
Magellan: Spark as a Geospatial Analytics Engine
Spark and Cassandra - Martin Van Ryswyk
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by Tom Phelan
Optimizing Apache Spark SQL Joins: Spark Summit East talk by Vida Ha
Lambda Architecture, Analytics and Data Pathways with Spark Streaming, Kafka, Akka and Cassandra
The Fast Path to Building Operational Applications with Spark: talk by Nikita Shamgunov
Building a modern data discovery and BI platform using Apache Spark and Catalyst with Kevin Beyer
Accelerating Machine Learning and Deep Learning At Scale With Apache Spark: talk by Ziya Ma
EclairJS = Node.js + Apache Spark
SparkSQL: A Compiler from Queries to RDDs: Spark Summit East talk by Sameer Agarwal
Machine Learning on Spark RoundTable
Spark Streaming and IoT
Clickstream Analysis with Spark—Understanding Visitors in Realtime
Debugging PySpark: Spark Summit East talk by Holden Karau
Building, Debugging, and Tuning Spark Machine Learning Pipelines - Joseph Bradley (Databricks)
Extending Spark with Java Agents (Jaroslav Bachorik)
Scientific Image Analysis Using Spark - Kevin Mader (ETH Zurich / Paul Scherrer Institut)
Using Spark and Riak for IoT Apps—Patterns and Anti Patterns: Spark Summit East talk by Pavel Hardak
Apache Toree: A Jupyter Kernel for Spark: Spark Summit East talk by Marius van Niekerk
Top 5 Mistakes When Writing Spark Applications
Spark on large Hadoop cluster and evaluation - Masaru Dobashi (NTT Data Corporation)
Lessons Learned From Running Spark On Docker
Why Spark on Hadoop Matters - M.C. Srivas
Interactive Visualization of Streaming Data Powered by Spark
Deep Dive: Apache Spark Memory Management
Deep Dive into Project Tungsten: Bringing Spark Closer to Bare Metal - Josh Rosen (Databricks)
Deep Dive Into Catalyst: Apache Spark 2.0's Optimizer
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
Spark on YARN: a Deep Dive - Sandy Ryza (Cloudera)
New Directions in PySpark for Time Series Analysis: Spark Summit East talk by David Palaitis
A Deep Dive into the Catalyst Optimizer Hands on Lab (Herman van Hovell)
Accelerating Spark Genome Sequencing in Cloud—A Data Driven Approach by Eric Kaczmarek and Lucy Lu
Fighting Cybercrime: A Joint Task Force of Real Time Data and Human Analytics by William Callaghan
Keeping Spark on Track: Productionizing Spark for ETL: talk by Kyle Pistor and Miklos Christine
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East talk by Jose Soltren
Spark: Data Science as a Service: Spark Summit East talk by Shekhar Agrawal and Sridhar Alla
Secured Kerberos based Spark Notebook for Data Science: Spark Summit East talk by Joy Chakraborty
Understanding Memory Management In Spark For Fun And Profit
Data Profiling and Pipeline Processing with Spark
Building Realtime Data Pipelines with Kafka Connect & Spark Streaming by Ewen Cheslack-Postava