Clojure for Data Science
By Henry Garner
Table of Contents
Clojure for Data Science
Credits
About the Author
Acknowledgments
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Statistics
Downloading the sample code
Running the examples
Downloading the data
Inspecting the data
Data scrubbing
Descriptive statistics
The mean
Interpreting mathematical notation
The median
Variance
Quantiles
Binning data
Histograms
The normal distribution
The central limit theorem
Poincaré's baker
Generating distributions
Skewness
Quantile-quantile plots
Comparative visualizations
Box plots
Cumulative distribution functions
The importance of visualizations
Visualizing electorate data
Adding columns
Adding derived columns
Comparative visualizations of electorate data
Visualizing the Russian election data
Comparative visualizations
Probability mass functions
Scatter plots
Scatter transparency
Summary
2. Inference
Introducing AcmeContent
Download the sample code
Load and inspect the data
Visualizing the dwell times
The exponential distribution
The distribution of daily means
The central limit theorem
Standard error
Samples and populations
Confidence intervals
Sample comparisons
Bias
Visualizing different populations
Hypothesis testing
Significance
Testing a new site design
Performing a z-test
Student's t-distribution
Degrees of freedom
The t-statistic
Performing the t-test
Two-tailed tests
One-sample t-test
Resampling
Testing multiple designs
Calculating sample means
Multiple comparisons
Introducing the simulation
Compile the simulation
The browser simulation
jStat
B1
Scalable Vector Graphics
Plotting probability densities
State and Reagent
Updating state
Binding the interface
Simulating multiple tests
The Bonferroni correction
Analysis of variance
The F-distribution
The F-statistic
The F-test
Effect size
Cohen's d
Summary
3. Correlation
About the data
Inspecting the data
Visualizing the data
The log-normal distribution
Visualizing correlation
Jittering
Covariance
Pearson's correlation
Sample r and population rho
Hypothesis testing
Confidence intervals
Regression
Linear equations
Residuals
Ordinary least squares
Slope and intercept
Interpretation
Visualization
Assumptions
Goodness-of-fit and R-square
Multiple linear regression
Matrices
Dimensions
Vectors
Construction
Addition and scalar multiplication
Matrix-vector multiplication
Matrix-matrix multiplication
Transposition
The identity matrix
Inversion
The normal equation
More features
Multiple R-squared
Adjusted R-squared
Incanter's linear model
The F-test of model significance
Categorical and dummy variables
Relative power
Collinearity
Multicollinearity
Prediction
The confidence interval of a prediction
Model scope
The final model
Summary
4. Classification
About the data
Inspecting the data
Comparisons with relative risk and odds
The standard error of a proportion
Estimation using bootstrapping
The binomial distribution
The standard error of a proportion formula
Significance testing proportions
Adjusting standard errors for large samples
Chi-squared multiple significance testing
Visualizing the categories
The chi-squared test
The chi-squared statistic
The chi-squared test
Classification with logistic regression
The sigmoid function
The logistic regression cost function
Parameter optimization with gradient descent
Gradient descent with Incanter
Convexity
Implementing logistic regression with Incanter
Creating a feature matrix
Evaluating the logistic regression classifier
The confusion matrix
The kappa statistic
Probability
Bayes theorem
Bayes theorem with multiple predictors
Naive Bayes classification
Implementing a naive Bayes classifier
Evaluating the naive Bayes classifier
Comparing the logistic regression and naive Bayes approaches
Decision trees
Information
Entropy
Information gain
Using information gain to identify the best predictor
Recursively building a decision tree
Using the decision tree for classification
Evaluating the decision tree classifier
Classification with clj-ml
Loading data with clj-ml
Building a decision tree in clj-ml
Bias and variance
Overfitting
Cross-validation
Addressing high bias
Ensemble learning and random forests
Bagging and boosting
Saving the classifier to a file
Summary
5. Big Data
Downloading the code and data
Inspecting the data
Counting the records
The reducers library
Parallel folds with reducers
Loading large files with iota
Creating a reducers processing pipeline
Curried reductions with reducers
Statistical folds with reducers
Associativity
Calculating the mean using fold
Calculating the variance using fold
Mathematical folds with Tesser
Calculating covariance with Tesser
Commutativity
Simple linear regression with Tesser
Calculating a correlation matrix
Multiple regression with gradient descent
The gradient descent update rule
The gradient descent learning rate
Feature scaling
Feature extraction
Creating a custom Tesser fold
Creating a matrix-sum fold
Calculating the total model error
Creating a matrix-mean fold
Applying a single step of gradient descent
Running iterative gradient descent
Scaling gradient descent with Hadoop
Gradient descent on Hadoop with Tesser and Parkour
Parkour distributed sources and sinks
Running a feature scale fold with Hadoop
Running gradient descent with Hadoop
Preparing our code for a Hadoop cluster
Building an uberjar
Submitting the uberjar to Hadoop
Stochastic gradient descent
Stochastic gradient descent with Parkour
Defining a mapper
Parkour shaping functions
Defining a reducer
Specifying Hadoop jobs with Parkour graph
Chaining mappers and reducers with Parkour graph
Summary
6. Clustering
Downloading the data
Extracting the data
Inspecting the data
Clustering text
Set-of-words and the Jaccard index
Tokenizing the Reuters files
Applying the Jaccard index to documents
The bag-of-words and Euclidean distance
Representing text as vectors
Creating a dictionary
Creating term frequency vectors
The vector space model and cosine distance
Removing stop words
Stemming
Clustering with k-means and Incanter
Clustering the Reuters documents
Better clustering with TF-IDF
Zipf's law
Calculating the TF-IDF weight
k-means clustering with TF-IDF
Better clustering with n-grams
Large-scale clustering with Mahout
Converting text documents to a sequence file
Using Parkour to create Mahout vectors
Creating distributed unique IDs
Distributed unique IDs with Hadoop
Sharing data with the distributed cache
Building Mahout vectors from input documents
Running k-means clustering with Mahout
Viewing k-means clustering results
Interpreting the clustered output
Cluster evaluation measures
Inter-cluster density
Intra-cluster density
Calculating the root mean square error with Parkour
Loading clustered points and centroids
Calculating the cluster RMSE
Determining optimal k with the elbow method
Determining optimal k with the Dunn index
Determining optimal k with the Davies-Bouldin index
The drawbacks of k-means
The Mahalanobis distance measure
The curse of dimensionality
Summary
7. Recommender Systems
Download the code and data
Inspect the data
Parse the data
Types of recommender systems
Collaborative filtering
Item-based and user-based recommenders
Slope One recommenders
Calculating the item differences
Making recommendations
Practical considerations for user and item recommenders
Building a user-based recommender with Mahout
k-nearest neighbors
Recommender evaluation with Mahout
Evaluating distance measures
The Pearson correlation similarity
Spearman's rank similarity
Determining optimum neighborhood size
Information retrieval statistics
Precision
Recall
Mahout's information retrieval evaluator
F-measure and the harmonic mean
Fall-out
Normalized discounted cumulative gain
Plotting the information retrieval results
Recommendation with Boolean preferences
Implicit versus explicit feedback
Probabilistic methods for large sets
Testing set membership with Bloom filters
Jaccard similarity for large sets with MinHash
Reducing pair comparisons with locality-sensitive hashing
Bucketing signatures
Dimensionality reduction
Plotting the Iris dataset
Principal component analysis
Singular value decomposition
Large-scale machine learning with Apache Spark and MLlib
Loading data with Sparkling
Mapping data
Distributed datasets and tuples
Filtering data
Persistence and caching
Machine learning on Spark with MLlib
Movie recommendations with alternating least squares
ALS with Spark and MLlib
Making predictions with ALS
Evaluating ALS
Calculating the sum of squared errors
Summary
8. Network Analysis
Download the data
Inspecting the data
Visualizing graphs with Loom
Graph traversal with Loom
The seven bridges of Königsberg
Breadth-first and depth-first search
Finding the shortest path
Minimum spanning trees
Subgraphs and connected components
SCC and the bow-tie structure of the web
Whole-graph analysis
Scale-free networks
Distributed graph computation with GraphX
Creating RDGs with Glittering
Measuring graph density with triangle counting
GraphX partitioning strategies
Running the built-in triangle counting algorithm
Implement triangle counting with Glittering
Step one – collecting neighbor IDs
Steps two, three, and four – aggregate messages
Step five – dividing the counts
Running the custom triangle counting algorithm
The Pregel API
Connected components with the Pregel API
Step one – map vertices
Steps two and three – the message function
Step four – update the attributes
Step five – iterate to convergence
Running connected components
Calculating the size of the largest connected component
Detecting communities with label propagation
Step one – map vertices
Step two – send the vertex attribute
Step three – aggregate value
Step four – vertex function
Step five – set the maximum iterations count
Running label propagation
Measuring community influence using PageRank
The flow formulation
Implementing PageRank with Glittering
Sort by highest influence
Running PageRank to determine community influencers
Summary
9. Time Series
About the data
Loading the Longley data
Fitting curves with a linear model
Time series decomposition
Inspecting the airline data
Visualizing the airline data
Stationarity
De-trending and differencing
Discrete time models
Random walks
Autoregressive models
Determining autocorrelation in AR models
Moving-average models
Determining autocorrelation in MA models
Combining the AR and MA models
Calculating partial autocorrelation
Autocovariance
PACF with Durbin-Levinson recursion
Plotting partial autocorrelation
Determining ARMA model order with ACF and PACF
ACF and PACF of airline data
Removing seasonality with differencing
Maximum likelihood estimation
Calculating the likelihood
Estimating the maximum likelihood
Nelder-Mead optimization with Apache Commons Math
Identifying better models with Akaike Information Criterion
Time series forecasting
Forecasting with Monte Carlo simulation
Summary
10. Visualization
Download the code and data
Exploratory data visualization
Representing a two-dimensional histogram
Using Quil for visualization
Drawing to the sketch window
Quil's coordinate system
Plotting the grid
Specifying the fill color
Color and fill
Outputting an image file
Visualization for communication
Visualizing wealth distribution
Bringing data to life with Quil
Drawing bars of differing widths
Adding a title and axis labels
Improving the clarity with illustrations
Adding text to the bars
Incorporating additional data
Drawing complex shapes
Drawing curves
Plotting compound charts
Output to PDF
Summary
Index
Clojure for Data Science
Clojure for Data Science
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: September 2015
Production reference: 1280815
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78439-718-0
www.packtpub.com
Credits
Author
Henry Garner
Reviewer
Dan Hammer
Commissioning Editor
Ashwin Nair
Acquisition Editor
Meeta Rajani
Content Development Editor
Shubhangi Dhamgaye
Technical Editor
Shivani Kiran Mistry
Copy Editor
Akshata Lobo
Project Coordinator
Harshal Ved
Proofreader
Safis Editing
Indexer
Monica Ajmera Mehta
Graphics
Nicholas Garner
Disha Haria
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta
About the Author
Henry Garner is a graduate of the University of Oxford and an experienced developer, CTO, and coach.
He started his technical career at Britain's largest telecoms provider, BT, working with a traditional data warehouse infrastructure. As part of a small team for three years, he built sophisticated data models to derive insight from raw data and built web applications to present the results. These applications were used internally by senior executives and operatives to track both business and systems performance.
He then went on to co-found Likely, a social media analytics start-up. As the CTO, he set the technical direction, leading to the introduction of an event-based append-only data pipeline modeled after the Lambda architecture. He adopted Clojure in 2011 and led a hybrid team of programmers and data scientists, building content recommendation engines based on collaborative filtering and clustering techniques. He developed a syllabus and copresented a series of evening classes from Likely's offices for professional developers who wanted to learn Clojure.
Henry now works with growing businesses, consulting in both a development and technical leadership capacity. He presents regularly at seminars and Clojure meetups in and around London.
Acknowledgments
Thank you Shubhangi Dhamgaye, Meeta Rajani, Shivani Mistry, and the entire team at Packt for their help in bringing this project to fruition. Without you, this book would never have come to pass.
I'm grateful to Dan Hammer, my Packt reviewer, for his valuable perspective as a practicing data scientist, and to those other brave souls who patiently read through the very rough early (and not-so-early) drafts. Foremost among these are Éléonore Mayola, Paul Butcher, and Jeremy Hoyland. Your feedback was not always easy to hear, but it made the book so much better than it would otherwise have been.
Thank you to the wonderful team at MastodonC who tackled a pre-release version of this book in their company book club, especially Éléonore Mayola, Jase Bell, and Elise Huard. I'm grateful to Francine Bennett for her advice early on—which helped to shape the structure of the book—and also to Bruce Durling, Neale Swinnerton, and Chris Adams for their company during the otherwise lonely weekends spent writing in the office.
Thank you to my friends from the machine learning study group: Sam Joseph, Geoff Hogg, and Ben Taylor for reading the early drafts and providing feedback suitable for Clojure newcomers; and also to Luke Snape and Tom Coupland of the Bristol Clojurians for providing the opportunity to test the material out on its intended audience.
A heartfelt thanks to my dad, Nicholas, for interpreting my vague scribbles into the fantastic figures you see in this book, and to my mum, Jacqueline, and sister, Mary, for being such patient listeners in the times I felt like thinking aloud. Last, but by no means least, thank you to the Nuggets of Wynford Road, Russell and Wendy, for the tea and sympathy whenever it occasionally became a bit too much. I look forward to seeing much more of you both from now on.
About the Reviewer
Dan Hammer is a presidential innovation fellow working on Data Innovation initiatives at the NASA headquarters in the CTO's office. Dan is an economist and data scientist. He was the chief data scientist at the World Resources Institute, where he launched Global Forest Watch in partnership with Google, USAID, and many others. Dan is on leave from a PhD program at UC Berkeley, advised by Max Auffhammer and George Judge. He teaches mathematics at the San Quentin State Prison as a lead instructor with the Prison University Project. Dan graduated with high honors in economics and mathematics from Swarthmore College, where he was a language scholar. He spent a full year building and racing Polynesian outrigger canoes in the South Pacific as a Watson Fellow. He has also reviewed Learning R for Geospatial Analysis by Packt Publishing.
Thanks to my wonderful wife Emily for suffering through my terrible jokes.
www.PacktPub.com
Support files, eBooks, discount offers, and more
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <service@packtpub.com> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
For Helen.
You provided support, encouragement, and welcome distraction in roughly equal measure.
Preface
A web search for "data science Venn diagram" returns numerous interpretations of the skills required to be an effective data scientist (it appears that data science commentators love Venn diagrams). Author and data scientist Drew Conway produced the prototypical diagram back in 2010, putting data science at the intersection of hacking skills, substantive expertise (that is, subject domain understanding), and mathematics and statistics knowledge. Between hacking skills and substantive expertise—those practicing without strong mathematics and statistics knowledge—lies the danger zone.
Five years on, as a growing number of developers seek to plug the data science skills shortage, there's more need than ever for statistical and mathematical education to help developers out of this danger zone. So, when Packt Publishing invited me to write a book on data science suitable for Clojure programmers, I gladly agreed. In addition to appreciating the need for such a book, I saw it as an opportunity to consolidate much of what I had learned as CTO of my own Clojure-based data analytics company. The result is the book I wish I had been able to read before starting out.
Clojure for Data Science aims to be much more than just a book of statistics for Clojure programmers. A large reason for the spread of data science into so many diverse areas is the enormous power of machine learning. Throughout the book, I'll show how to use pure Clojure functions and third-party libraries to construct machine learning models for the primary tasks of regression, classification, clustering, and recommendation.
Approaches that scale to very large datasets, so-called "big data", are of particular interest to data scientists, because they can reveal subtleties that are lost in smaller samples. This book shows how Clojure can be used to concisely express jobs to run on the Hadoop and Spark distributed computation frameworks, and how to incorporate machine learning through the use of both dedicated external libraries and general optimization techniques.
Above all, this book aims to foster an understanding not just of how to perform particular types of analysis, but of why such techniques work. In addition to providing practical knowledge (almost every concept in this book is expressed as a runnable example), I aim to explain the theory that will allow you to take a principle and apply it to related problems. I hope that this approach will enable you to effectively apply statistical thinking in diverse situations well into the future, whether or not you decide to pursue a career in data science.
What this book covers
Chapter 1, Statistics, introduces Incanter, Clojure's primary statistical computing library used throughout the book. With reference to the data from the elections in the United Kingdom and Russia, we demonstrate the use of summary statistics and the value of statistical distributions while showing a variety of comparative visualizations.
Chapter 2, Inference, covers the difference between samples and populations, and statistics and parameters. We introduce hypothesis testing as a formal means of determining whether the differences are significant in the context of A/B testing website designs. We also cover sample bias, effect size, and solutions to the problem of multiple testing.
Chapter 3, Correlation, shows how we can discover linear relationships between variables and use the relationship to make predictions about some variables given others. We implement linear regression—a machine learning algorithm—to predict the weights of Olympic swimmers given their heights, using only core Clojure functions. We then make our model more sophisticated using matrices and more data to improve its accuracy.
Chapter 4, Classification, describes how to implement several different types of machine learning algorithm (logistic regression, naive Bayes, C4.5, and random forests) to make predictions about the survival rates of passengers on the Titanic. We learn about another test for statistical significance that works for categories instead of continuous values, explain various issues you're likely to encounter while training machine learning models such as bias and overfitting, and demonstrate how to use the clj-ml machine learning library.
Chapter 5, Big Data, shows how Clojure can leverage the parallel capabilities in computers of all sizes using the reducers library, and how to scale up these techniques to clusters of machines on Hadoop with Tesser and Parkour. Using ZIP code level tax data from the IRS, we demonstrate how to perform statistical analysis and machine learning in a scalable way.
Chapter 6, Clustering, shows how to identify text documents that share similar subject matter using Hadoop and the Java machine learning library, Mahout. We describe a variety of techniques particular to text processing as well as more general concepts related to clustering. We also introduce some more advanced features of Parkour that can help get the best performance from your Hadoop jobs.
Chapter 7, Recommender Systems, covers a variety of different approaches to the challenge of recommendation. In addition to implementing a recommender with core Clojure functions, we tackle the ancillary challenge of dimensionality reduction by using principal component analysis and singular value decomposition, as well as probabilistic set compression using Bloom filters and the MinHash algorithm. Finally, we introduce the Sparkling and MLlib libraries for machine learning on the Spark distributed computation framework and use them to produce movie recommendations with alternating least squares.
Chapter 8, Network Analysis, shows a variety of ways of analyzing graph-structured data. We demonstrate the methods of traversal using the Loom library and then show how to use the Glittering and GraphX libraries with Spark to discover communities and influencers in social networks.
Chapter 9, Time Series, demonstrates how to fit curves to simple time series data. Using data on the monthly airline passenger counts, we show how to forecast future values for more complex series by training an autoregressive moving-average model. We do this by implementing a method of parameter optimization called maximum likelihood estimation with help from the Apache Commons Math library.
Chapter 10, Visualization, shows how the Clojure library Quil can be used to create custom visualizations for charts not provided by Incanter, and attractive graphics that can communicate findings clearly to your audience, whatever their background.
What you need for this book
The code for each chapter has been made available as a project on GitHub at https://github.com/clojuredatascience. The example code can be downloaded as a zip file from there, or cloned with the Git command-line tool. All of the book's examples can be compiled and run with the Leiningen build tool as described in Chapter 1, Statistics.
This book assumes that you're already able to compile and run Clojure code using Leiningen (http://leiningen.org/). Refer to Leiningen's website if you're not yet set up to do this.
In addition, the code for many of the sample chapters makes use of external datasets. Where possible, these have been included together with the sample code. Where this has not been possible, instructions for downloading the data have been provided in the sample code's README file. Bash scripts have also been provided with the relevant sample code to automate this process. These can be run directly by Linux and OS X users, as described in the relevant chapter, provided the curl, wget, tar, gzip, and unzip utilities are installed. Windows users may have to install a Linux emulator such as Cygwin (https://www.cygwin.com/) to run the scripts.
Who this book is for
This book is intended for intermediate and advanced Clojure programmers who want to build their statistical knowledge, apply machine learning algorithms, or process large amounts of data with Hadoop and Spark. Many aspiring data scientists will benefit from learning all of these skills, and Clojure for Data Science is intended to be read in order from the beginning to the end. Readers who approach the book in this way will find that each chapter builds on concepts introduced in the prior chapters.
If you're not already comfortable reading Clojure code, you're likely to find this book particularly challenging. Fortunately, there are now many excellent resources for learning Clojure and I do not attempt to replicate their work here. At the time of writing, Clojure for the Brave and True (http://www.braveclojure.com/) is a fantastic free resource for learning the language. Consult http://clojure.org/getting_started for links to many other books and online tutorials suitable for newcomers.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Each example is a function in the cljds.ch1.examples namespace that can be run."
A block of code is set as follows:
(defmulti load-data identity)
(defmethod load-data :uk [_]
  (-> (io/resource "UK2010.xls")
      (str)
      (xls/read-xls)))
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
(q/fill (fill-fn x y))
(q/rect x-pos y-pos x-scale y-scale))
(q/save "heatmap.png"))]
(q/sketch :setup setup :size size))
Any command-line input or output is written as follows:
lein run -e 1.1
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Each time the New Sample button is pressed, a pair of new samples from an exponential distribution with population means taken from the sliders are generated."
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <feedback@packtpub.com>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
Downloading the color images of this book
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/Clojure_for_Data_Science_ColorImages.pdf.
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <copyright@packtpub.com> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at <questions@packtpub.com>, and we will do our best to address the problem.
Chapter 1. Statistics
Over the course of the following ten chapters of Clojure for Data Science, we'll attempt to discover a broadly linear path through the field of data science. In fact, we'll find as we go that the path is not quite so linear, and the attentive reader ought to notice many recurring themes along the way.
Descriptive statistics concern themselves with summarizing sequences of numbers and they'll appear, to some extent, in every chapter in this book. In this chapter, we'll build foundations for what's to come by implementing functions to calculate the mean, median, variance, and standard deviation of numerical sequences in Clojure. While doing so, we'll attempt to take the fear out of interpreting mathematical formulae.
As soon as we have more than one number to analyze, it becomes meaningful to ask how those numbers are distributed. You've probably already heard expressions such as "the long tail" and "the 80/20 rule". They concern the spread of numbers throughout a range. We demonstrate the value of distributions in this chapter and introduce the most useful of them all: the normal distribution.
The study of distributions is aided immensely by visualization, and for this we'll use the Clojure library Incanter. We'll show how Incanter can be used to load, transform, and visualize real data. We'll compare the results of two national elections—the 2010 United Kingdom general election and the 2011 Russian presidential election—and see how even basic analysis can provide evidence of potentially fraudulent activity.
Downloading the sample code
All of the book's sample code is available on Packt Publishing's website at http://www.packtpub.com/support or from GitHub at http://github.com/clojuredatascience. Each chapter's sample code is available in its own repository.
Note
The sample code for Chapter 1, Statistics can be downloaded from https://github.com/clojuredatascience/ch1-statistics.
Executable examples are provided regularly throughout all chapters, either to demonstrate the effect of code that has just been explained, or to demonstrate statistical principles that have been introduced. All example function names begin with ex- and are numbered sequentially throughout each chapter. So, the first runnable example of Chapter 1, Statistics is named ex-1-1, the second is named ex-1-2, and so on.
Running the examples
Each example is a function in the cljds.ch1.examples namespace that can be run in two ways—either from the REPL or on the command line with Leiningen. If you'd like to run the examples in the REPL, you can execute:
lein repl
on the command line. By default, the REPL will open in the examples namespace. Alternatively, to run a specific numbered example, you can execute:
lein run --example 1.1
or pass the single-letter equivalent:
lein run -e 1.1
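From the REPL, running an example is simply a matter of calling the corresponding function. As a minimal sketch (assuming the REPL has opened in the examples namespace, as described above):
;; Inside lein repl, each example is an ordinary function call:
(ex-1-1)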
We only assume basic command-line familiarity throughout this book. The ability to run Leiningen and shell scripts is all that's required.
Tip
If you become stuck at any point, refer to the book's wiki at http://wiki.clojuredatascience.com. The wiki will provide troubleshooting tips for known issues, including advice for running examples on a variety of platforms.
In fact, shell scripts are only used for fetching data from remote locations automatically. The book's wiki will also provide alternative instructions for those not wishing or unable to execute the shell scripts.
Downloading the data
The dataset for this chapter has been made available by the Complex Systems Research Group at the Medical University of Vienna. The analysis we'll be performing closely mirrors their research to determine the signals of systematic election fraud in the national elections of countries around the world.
Note
For more information about the research, and for links to download other datasets, visit the book's wiki or the research group's website at http://www.complex-systems.meduniwien.ac.at/elections/election.html.
Throughout this book we'll be making use of numerous datasets. Where possible, we've included the data with the example code. Where this hasn't been possible—either because of the size of the data or due to licensing constraints—we've included a script to download the data instead.
Chapter 1, Statistics is just such a chapter. If you've cloned the chapter's code and intend to follow the examples, download the data now by executing the following on the command line from within the project's directory:
script/download-data.sh
The script will download and decompress the sample data into the project's data directory.
Tip
If you have any difficulty running the download script or would like to follow manual instructions instead, visit the book's wiki at http://wiki.clojuredatascience.com for assistance.
We'll begin investigating the data in the next section.
Inspecting the data
Throughout this chapter, and for many other chapters in this book, we'll be using the Incanter library (http://incanter.org/) to load, manipulate, and display data.
Incanter is a modular suite of Clojure libraries that provides statistical computing and visualization capabilities. Modeled after the extremely popular R environment for data analysis, it brings together the power of Clojure, an interactive REPL, and a set of powerful abstractions for working with data.
Each module of Incanter focuses on a specific area of functionality. For example, incanter-stats contains a suite of related functions for analyzing data and producing summary statistics, while incanter-charts provides a large number of visualization capabilities. incanter-core provides the most fundamental and generally useful functions for transforming data.
Each module can be included separately in your own code. For access to stats, charts, and Excel features, you could include the following in your project.clj:
:dependencies [[incanter/incanter-core "1.5.5"]
               [incanter/incanter-stats "1.5.5"]
               [incanter/incanter-charts "1.5.5"]
               [incanter/incanter-excel "1.5.5"]
               ...]
If you don't mind including more libraries than you need, you can simply include the full Incanter distribution instead:
:dependencies [[incanter/incanter "1.5.5"]
               ...]
At Incanter's core is the concept of a dataset—a structure of rows and columns. If you have experience with relational databases, you can think of a dataset as a table. Each column in a dataset is named, and each row in the dataset has the same number of columns as every other. There are several ways to load data into an Incanter dataset, and which we use will depend on how our data is stored:
If our data is a text file (a CSV or tab-delimited file), we can use the read-dataset function from incanter-io
If our data is an Excel file (for example, an .xls or .xlsx file), we can use the read-xls function from incanter-excel
For any other data source (an external database, website, and so on), as long as we can get our data into a Clojure data structure, we can create a dataset with the dataset function in incanter-core, as sketched below
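As a brief illustration of this last approach, here is a minimal sketch of the dataset function; the column names and rows below are illustrative values, not data from this chapter:
;; Building an Incanter dataset directly from column names and a
;; sequence of rows:
(i/dataset ["year" "constituencies"]
           [[2005 646]
            [2010 650]])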
This chapter makes use of Excel data sources, so we'll be using read-xls. The function takes one required argument—the file to load—and an optional keyword argument specifying the sheet number or name. All of our examples have only one sheet, so we'll just provide the file argument as a string:
(ns cljds.ch1.data
  (:require [clojure.java.io :as io]
            [incanter.core :as i]
            [incanter.excel :as xls]))
In general, we will not reproduce the namespace declarations from the example code. This is both for brevity and because the required namespaces can usually be inferred by the symbol used to reference them. For example, throughout this book we will always refer to clojure.java.io as io, incanter.core as i, and incanter.excel as xls wherever they are used.
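For reference, the optional sheet argument mentioned earlier is passed as a keyword option alongside the filename. The following is only a sketch—the :sheet keyword reflects Incanter's documented option, but since all of our files contain a single sheet, we never need it in this chapter:
;; Reading a numbered (or named) sheet from a multi-sheet workbook:
(xls/read-xls (str (io/resource "UK2010.xls")) :sheet 0)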
We'll be loading several data sources throughout this chapter, so we've created a multimethod called load-data in the cljds.ch1.data namespace:
(defmulti load-data identity)

(defmethod load-data :uk [_]
  (-> (io/resource "UK2010.xls")
      (str)
      (xls/read-xls)))
In the preceding code, we define the load-data multimethod that dispatches on the identity of the first argument. We also define the implementation that will be called if the first argument is :uk. Thus, a call to (load-data :uk) will return an Incanter dataset containing the UK data. Later in the chapter, we'll define additional load-data implementations for other datasets.
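To make the dispatch mechanism concrete, a further dataset could be registered as follows. This is only a sketch: the :ru keyword and filename are hypothetical placeholders here, and the chapter defines its real implementations where they are needed:
;; A hypothetical second implementation: dispatching on :ru loads a
;; different file through the same load-data entry point.
(defmethod load-data :ru [_]
  (-> (io/resource "Russia2011.xls") ; placeholder filename
      (str)
      (xls/read-xls)))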
The first row of the UK2010.xls spreadsheet contains column names. Incanter's read-xls function will preserve these as the column names of the returned dataset. Let's begin our exploration of the data by inspecting them now—the col-names function in incanter.core returns the column names as a vector. In the following code (and throughout the book, where we use functions from the incanter.core namespace) we require it as i:
(defn ex-1-1 []
  (i/col-names (load-data :uk)))
As described in the Running the examples section earlier, functions beginning with ex- can be run on the command line with Leiningen like this:
lein run -e 1.1
The output of the preceding command should be the following Clojure vector:
["Press Association Reference" "Constituency Name" "Region" "Election Year"
 "Electorate" "Votes" "AC" "AD" "AGS" "APNI" "APP" "AWL" "AWP" "BB" "BCP"
 "Bean" "Best" "BGPV" "BIB" "BIC" "Blue" "BNP" "BP Elvis" "C28" "Cam Soc"
 "CG" "Ch M" "Ch P" "CIP" "CITY" "CNPG" "Comm" "Comm L" "Con" "Cor D"
 "CPA" "CSP" "CTDP" "CURE" "D Lab" "D Nat" "DDP" "DUP" "ED" "EIP" "EPA"
 "FAWG" "FDP" "FFR" "Grn" "GSOT" "Hum" "ICHC" "IEAC" "IFED" "ILEU"
 "Impact" "Ind1" "Ind2" "Ind3" "Ind4" "Ind5" "IPT" "ISGB" "ISQM" "IUK"
 "IVH" "IZB" "JAC" "Joy" "JP" "Lab" "Land" "LD" "Lib" "Libert" "LIND"
 "LLPB" "LTT" "MACI" "MCP" "MEDI" "MEP" "MIF" "MK" "MPEA" "MRLP" "MRP"
 "Nat Lib" "NCDV" "ND" "New" "NF" "NFP" "NICF" "Nobody" "NSPS" "PBP" "PC"
 "Pirate" "PNDP" "Poet" "PPBF" "PPE" "PPNV" "Reform" "Respect" "Rest"
 "RRG" "RTBP" "SACL" "Sci" "SDLP" "SEP" "SF" "SIG" "SJP" "SKGP" "SMA"
 "SMRA" "SNP" "Soc" "Soc Alt" "Soc Dem" "Soc Lab" "South" "Speaker" "SSP"
 "TF" "TOC" "Trust" "TUSC" "TUV" "UCUNF" "UKIP" "UPS" "UV" "VCCA" "Vote"
 "Wessex Reg" "WRP" "You" "Youth" "YRDPL"]
This is a very wide dataset. The first six columns in the data file are described as follows; subsequent columns break the number of votes down by party:
Press Association Reference: This is a number identifying the constituency (voting district, represented by one MP)
Constituency Name: This is the common name given to the voting district
Region: This is the geographic region of the UK where the constituency is based
Election Year: This is the year in which the election was held
Electorate: This is the total number of people eligible to vote in the constituency
Votes: This is the total number of votes cast
Whenever we're confronted with new data, it's important to take time to understand it. In the absence of detailed data definitions, one way we could do this is to begin by validating our assumptions about the data. For example, we expect that this dataset contains information about the 2010 election, so let's review the contents of the Election Year column.
Incanter provides the i/$ function (i, as before, signifying the incanter.core namespace) for selecting columns from a dataset. We'll encounter the function regularly throughout this chapter—it's Incanter's primary way of selecting columns from a variety of data representations and it provides several different arities. For now, we'll be providing just the name of the column we'd like to extract and the dataset from which to extract it:
(defn ex-1-2 []
  (i/$ "Election Year" (load-data :uk)))
;; (2010.0 2010.0 2010.0 2010.0 2010.0 ... 2010.0 2010.0 nil)
The years are returned as a single sequence of values. The output may be hard to interpret since the dataset contains so many rows. As we'd like to know which unique values the column contains, we can use the Clojure core function distinct. One of the advantages of using Incanter is that its useful data manipulation functions augment those that Clojure already provides, as shown in the following example:
(defn ex-1-3 []
  (->> (load-data :uk)
       (i/$ "Election Year")
       (distinct)))
;; (2010.0 nil)
The value 2010 goes a long way to confirming our expectations that this data is from 2010. The nil value is unexpected, though, and may indicate a problem with our data.
We don't yet know how many nils exist in the dataset and determining this could help us decide what to do next. A simple way of counting values such as this is to use the core library function frequencies, which returns a map of values to counts:
(defn ex-1-4 []
  (->> (load-data :uk)
       (i/$ "Election Year")
       (frequencies)))
;; {2010.0 650, nil 1}
In the preceding examples, we used Clojure's thread-last macro ->> to chain several functions together for legibility.
Tip
Along with Clojure's large core library of data manipulation functions, macros such as the one discussed earlier—including the thread-last macro ->>—are other great reasons for using Clojure to analyze data. Throughout this book, we'll see how Clojure can make even sophisticated analysis concise and comprehensible.
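To see exactly what the thread-last macro does, compare the threaded form of the previous example with its nested equivalent (a sketch for illustration only):
;; The threaded version passes each result as the last argument of the
;; next form...
(->> (load-data :uk)
     (i/$ "Election Year")
     (frequencies))

;; ...and is equivalent to this nested version:
(frequencies (i/$ "Election Year" (load-data :uk)))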
It wouldn't take us long to confirm that in 2010 the UK had 650 electoral districts, known as constituencies. Domain knowledge such as this is invaluable when sanity-checking new data. Thus, it's highly probable that the nil value is extraneous and can be removed. We'll see how to do this in the next section.
Data scrubbing
It is a commonly repeated statistic that at least 80 percent of a data scientist's work is data scrubbing. This is the process of detecting potentially corrupt or incorrect data and either correcting or filtering it out.
Note
Data scrubbing is one of the most important (and time-consuming) aspects of working with data. It's a key step to ensuring that subsequent analysis is performed on data that is valid, accurate, and consistent.
The nil value at the end of the election year column may indicate dirty data that ought to be removed. We've already seen that filtering columns of data can be accomplished with Incanter's i/$ function. For filtering rows of data we can use Incanter's i/query-dataset function.
We let Incanter know which rows we'd like it to filter by passing a Clojure map of column names and predicates. Only rows for which all predicates return true will be retained. For example, to select only the nil values from our dataset:
(-> (load-data :uk)
    (i/query-dataset {"Election Year" {:$eq nil}}))
If you know SQL, you'll notice this is very similar to a WHERE clause. In fact, Incanter also provides the i/$where function, an alias to i/query-dataset that reverses the order of the arguments.
The query is a map of column names to predicates and each predicate is itself a map of operator to operand. Complex queries can be constructed by specifying multiple columns and multiple operators together. Query operators include:
:$gt greater than
:$lt less than
:$gte greater than or equal to
:$lte less than or equal to
:$eq equal to
:$ne not equal to
:$in to test for membership of a collection
:$nin to test for non-membership of a collection
:$fn a predicate function that should return a true response for rows to keep
If none of the built-in operators suffice, the last operator provides the ability to pass a custom function instead.
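As a sketch of this last operator, the following query uses :$fn with a predicate applied to each row's value for the named column. The choice of predicate here is illustrative; it keeps only rows whose election year is a number, achieving the same effect as :$ne with nil:
;; Keep rows for which (number? value) returns true:
(->> (load-data :uk)
     (i/$where {"Election Year" {:$fn number?}}))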
We'll continue to use Clojure's thread-last macro to make the intent of the code a little clearer, and we'll return the row as a map of keys and values using the i/to-map function:
(defn ex-1-5 []
  (->> (load-data :uk)
       (i/$where {"Election Year" {:$eq nil}})
       (i/to-map)))
;; {:ILEU nil, :TUSC nil, :Vote nil ... :IVH nil, :FFR nil}
Looking at the results carefully, it's apparent that all but one of the columns in this row are nil. In fact, a bit of further exploration confirms that this row is a summary total and ought to be removed from the data. We can remove the problematic row by updating the predicate map to use the :$ne operator, returning only rows where the election year is not equal to nil:
(->> (load-data :uk)
     (i/$where {"Election Year" {:$ne nil}}))
The preceding function is one we'll almost always want to make sure we call in advance of using the data. One way of doing this is to add another implementation of our load-data multimethod, which also includes this filtering step:
(defmethod load-data :uk-scrubbed [_]
  (->> (load-data :uk)
       (i/$where {"Election Year" {:$ne nil}})))
Now, with any code we write, we can choose whether to refer to the :uk or :uk-scrubbed datasets.
By always loading the source file and performing our scrubbing on top, we're preserving an audit trail of the transformations we've applied. This makes it clear to us—and future readers of our code—what adjustments have been made to the source. It also means that, should we need to re-run our analysis with new source data, we may be able to just load the new file in place of the existing file.
Descriptive statistics
Descriptive statistics are numbers that are used to summarize and describe data. In the next chapter, we'll turn our attention to a more sophisticated analysis, the so-called inferential statistics, but for now we'll limit ourselves to simply describing what we can observe about the data contained in the file.
To demonstrate what we mean, let's look at the Electorate column of the data. This column lists the total number of registered voters in each constituency:
(defn ex-1-6 []
  (->> (load-data :uk-scrubbed)
       (i/$ "Electorate")
       (count)))
;; 650
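The count agrees with the 650 constituencies we confirmed earlier. The simplest summary of a sequence of numbers such as this is the mean. As a minimal sketch using only core Clojure (its result for the electorate data is not shown here):
;; The arithmetic mean: the sum of the values divided by their count.
(defn mean [xs]
  (/ (reduce + xs) (count xs)))

;; Applied to the sizes of the UK electorates:
(->> (load-data :uk-scrubbed)
     (i/$ "Electorate")
     (mean))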