Data Science in Finance (Article) - DataCamp PDF
Hugo Bowne-Anderson
August 13th, 2018
DataFramed #35
Here is a link to the podcast.
https://www.datacamp.com/community/blog/data-science-finance-transcript 1/23
5/18/2019 Data Science in Finance (article) - DataCamp
Hugo: It's a real pleasure to have you on the show, and I'm really excited to have you here
to talk about your work in finance, how you think about the use of Python in finance, and
the implications of all of this with respect to data science in general. But before we get
into that I just want to get a bit of context and learn a bit about you. So, maybe you can
start by telling us what you're most known for in the data community.
Yves: Yeah, I think this is an easy one. So, I'm known for Python for finance. I started using
Python more than ten years ago for finance. We started with computational finance and
people said, "Well, you can't do that. It's too slow," and all these kinds of prejudices and
reasons why you can't do something, and today the biggest institutions around the world
use Python for exactly that. In particular for algo trading, so it has moved a little bit from
the computational finance side to the algo trading side, and I think people now know me
for this as well. So whenever somebody thinks of coding up their strategy and deploying it
automatically for trading, they use Python, and many people book our online
courses, listen to the talks, or read the books.
Hugo: Fantastic. And how did you get there? How did you get into Python, finance, and
now data science? What type of journey led you to where you are now?
Yves: Well, after our German Abitur, which is how we finish high school, so to say, at the
German Gymnasium, I studied business administration, and I already focused on the
finance side of things. Everything that I did to earn my diploma in business administration
was centered around finance, financial markets, banking and all these topics that interested
me from the beginning. Immediately after, I started working on my PhD
thesis, which was in mathematical finance, so it got more formal over time, actually. I had
a topic about dynamic hedging and the feedback in markets due to hedging of options. So
it evolved in terms of the math that you need, but I really enjoyed it, the analytical
challenges and writing some theoretical stuff and using LaTeX and all these things.
Yves: But in the meantime, I started working as a management consultant because I
thought, well, math finance is nice and challenging, but if you later want to work in any kind
of institute, any kind of organization, corporation or whatnot, it might be good to have
seen a little bit more. So I started as a general management consultant working for different
companies and on different types of projects, but again the focus was on finance. I didn't
leave the financial industry that much. After that I finished my PhD. I moved for
the first time within Germany, to Hamburg. Back then it was around the hype of the
internet startups. The web was then the hot thing, and so with the company I joined we
did some consulting around web topics and so forth, but after the burst of the bubble,
there was no work anymore. So I needed to look for something else. What
we did was start our own company.
Yves: I founded my first company in 2001. Actually pretty nice, because last Friday I was at
the company for the summer fest. I left the company due to personal reasons and moved
back to my hometown in the Saarland area due to the family, but we're still on good
speaking terms and the company is going pretty well, I must say.
Hugo: That must have been an interesting experience 17 years later being back there.
Right?
Yves: Yeah, actually the company has more than 70 consultants these days and has been taken
over by a larger company. So I hardly knew the people around there, but of
course the co-founders and some of the others were there, so this was pretty nice for me.
Had a nice evening there. When I moved back, this was actually the starting point where I
used Python for finance for the first time, because I had discovered it before moving back. I
founded my own company, which I still own and which is now The Python Quants, and
immediately I got started with some side projects which I couldn't pursue in the other
context, with co-founders that didn't want to do something like that. So now I've got
the freedom to pursue my own stuff, and this was, among others, Python. We
started at a time when even NumPy wasn't around, so we used Numarray and Numeric
and all this.
Hugo: Well that was going to be one of my questions. Like this is before numpy, this is
before pandas, this is before so many of the technologies that people equate with the data
science Python stack now, so it must have been a wild landscape.
Yves: Exactly, exactly. So we've done things that are now kind of standard and that you
might teach in, I don't know, the third hour of a training with pandas. We needed to
code them up on our own, like time series analysis, or when we wanted to do something in
computational finance and all these things. But I'm more than happy that, for example,
Wes McKinney started the pandas project, and that the project has grown
that much, providing us with all the nice capabilities that we use today in finance. But for
sure, it was really a different landscape. Many people can't imagine what it looked like
then, but I was convinced, due to the beauty of the language and the whole approach, that
this might be the future. And now, well, this time I think I was proved right given the
success of Python in our industry, and yeah, today we don't do anything else. It's all about
Python in finance and algo trading.
Hugo: Yeah, and particularly with the emergence of IPython as well and Jupyter
notebooks, which back in 2001 weren't around either, so it's very exciting that all
these developments have converged in this landscape that allows us to do what we do.
Yves: Yeah, it's fantastic, yes. I think from a scientific point of view, from a developer's
point of view, data science point of view, or financial data science if you want to put it
that way. Yeah, for sure.
Yves: When I was talking about Python being a side project, indeed it was a side project,
so we couldn't make any money out of that. But we did the regular consulting work that
I was doing before, and I think it was maybe six years ago when we started
getting real traction. Maybe seven years; I think it was 2011 when we got the first big
client in Germany. The derivatives exchange Eurex approached us. Actually, one of
the executives had seen a talk of mine at EuroPython in Florence, and this is how things
work, and said, "Well, do you do training in this regard? And can you support us?" And I said,
"Well, for sure, this is what we were waiting for!"
Yves: So this was more or less the formal starting point and yeah, it's been growing since,
working with many banks and big hedge funds around the world and having hundreds of
training students and thousands of people on our online platform that use Python, and yeah,
so today we don't do anything else. People look up Python finance or whatever, I think on
Google or wherever, and they usually find us and get to our trainings, to my talks. I've given I
don't know how many talks. Well more than a hundred over the last five years at
conferences around the world.
Hugo: For sure, so you write books, you consult, you develop training, you host meetups,
are there other- I mean, not that these aren't enough, but these are the main things,
that's where your focus lies currently?
Yves: Yeah actually, so I basically see us these days as a more or less content-oriented
company. So this is what I think our core is, indeed. So writing books, of course, is about
content creation, but so is designing and delivering online training. I also put events in the
same category, where of course you also need some content. We used to do more
conferences, but these days we focus, for example with our partner Fitch Learning, on the
bootcamp side, so this is also more about training and about properly delivering the content.
Yves: The content, of course, is at the core, and also I think the skills and know-how, which are
then used, in addition to the content, when we consult clients around the world. For example,
we're working with a big broker in New York, in Manhattan, and we're working with a hedge
fund in London, which of course requires a certain skill set and know-how, but this is more or
less where I'm coming from. The content side has become more and more important over
these years, in particular with our online training, which is growing tremendously, which
I'm really happy about. And you mentioned the meetups: I'm running a few meetups, actually,
in London, Berlin, Frankfurt, Paris, and also one in New York, and whenever I'm there I try to
do something there as well, so it's keeping me busy.
What is a quant?
Hugo: Definitely sounds like it. So you're known as the Python Quant, you work with a
team of people and you call yourselves the Python Quants, and I think myself and our
listeners know what Python is but I don't know whether everyone has a lot of context
around what a quant is, I definitely don't feel I have enough. So maybe you can kind of
unpack this term for us: quant.
Yves: Yeah, now if we want to get a little bit academic you would say, "Well, we have
different types of quants." But I think these days it has narrowed down a little bit. So back
in the days, when I started with the quant side of things in finance, what people typically
understood as a quant was kind of the model quant. Somebody who is sitting down and
comes up with a specific model, for example, to price a specific type of financial
derivative. So a complex financial instrument that might depend on a couple of risk
factors, like interest rates, or let's say a stock index, or maybe a basket of stocks, for
example. So people sitting down and really doing research on the blackboard, or these
days probably a whiteboard, or with pen and paper, back in the days. A little bit maybe on the
computer, but more or less to document. These were really kind of
mathematicians.
Hugo: Yeah, and physicists as well, right? A lot of physicists started this.
Yves: This is what people called back then the rocket scientists, because of the fact that
many physicists came to finance. Well, it's because the mathematics that you need to price
financial derivatives is pretty similar to what physicists use on a daily basis in many of
their areas. Engineers as well, and so forth. So this was the origin, so to say, of the quants, the
rocket scientists, and so forth. But these days I think it's much more data driven.
Yves: So I would say these days a quant is more like a financial data scientist, because they
need to crunch huge amounts of data, and I think it might differ to some extent depending on
the area you have a look at. For example, if there is, let's say, an equity research analyst who
crunches numbers of certain companies, let's say Apple, they typically are not faced with
that amount of data. But if you have others who work in a more systematic way, for
example on systematic trading strategies, they might even call themselves data
engineers.
Yves: So one of the biggest hedge funds out there, and one of the most successful ones, Two
Sigma, headquartered in New York, calls these guys data engineers. And that is because they
are probably more or less independent of what the finance professors, the people
coming up with financial theories, say about how the market should behave. They have kind of
a pretty neutral approach in saying, "Well, let's apply whatever technique to the data
that we can get our hands on and see how we can profit from that." So this is what I call,
in all my talks these days, data-driven finance instead of this kind of equation-driven
finance. Instead of sitting down and thinking about how the financial world should behave,
they'd rather have a look at the data and try to figure out something, maybe not coming
up with this kind of fantastic, nice single equation which might award you a Nobel prize in
economics some day in the future, no, but with things that might work.
Yves: It's a similar approach at every big data company out there, like all the social media
companies, like Twitter or Spotify. How they come up with their recommendations is
simply to have a look at the data for the recommender engines, use machine learning
approaches, and recommend song number three when people usually hear three in
combination with one and two, and this is what these data engineers, these quants, do
these days as well. So having a data-driven mindset and saying, "Well, let's have a look at the
data and apply whatever technique might bring us something in this regard." So people
working with quantitative things, usually numerical data and more and more
unstructured text-based data, this is what the quants do these days. So we might still have
a handful of model quants, which is what I started with, but it's just a handful compared to
the thousands of others that work with financial data in this area.
Hugo: Okay, and so to reframe that slightly, one of the ideas there is that in the large data
limit, theory laden models may be unnecessary, essentially.
Yves: Essentially yes, because when you have a look at the history, people have been pretty
successful at coming up with nice models when they've made, let's say, appropriate
assumptions, typically in the form of normally distributed returns and linear relationships.
But this is already where all these theories are doomed to fail, because we don't
have normally distributed returns in markets. Across all market classes, you can analyze the
returns in the markets: they are not normally distributed. In general the relationships that you
face are nonlinear, and therefore having a different look at the nonlinear, changing world
with different algorithms might prove more fruitful than relying on the old theories from
the 50's, 60's, 70's, 80's that are still in use today.
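One simple way to test the normality assumption Yves mentions is to look at the excess kurtosis of a return series: it is close to zero for normally distributed data and clearly positive for fat-tailed data. The sketch below is only an illustration and uses synthetic data, not market data. The helper `excess_kurtosis`, the Student-t stand-in for "real" returns, and all parameter values are assumptions made for this example.

```python
import numpy as np

def excess_kurtosis(returns):
    """Sample excess kurtosis: about 0 for normal data, positive for fat tails."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean()) / r.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(seed=42)
n = 100_000

# Hypothetical daily returns (synthetic, for illustration only):
# one normally distributed sample and one fat-tailed Student-t sample.
normal_returns = 0.01 * rng.standard_normal(n)
fat_tailed_returns = 0.01 * rng.standard_t(df=5, size=n)

k_normal = excess_kurtosis(normal_returns)
k_fat = excess_kurtosis(fat_tailed_returns)

print(f"excess kurtosis, normal sample:     {k_normal:.2f}")
print(f"excess kurtosis, fat-tailed sample: {k_fat:.2f}")
```

Running the same function on actual daily returns of a stock or index would be the natural next step; the claim in the interview is that such series typically look like the second sample rather than the first.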
Hugo: Yeah, absolutely. So in general, what are the major subdisciplines of finance that
you think data science is and can have a large impact in?
Yves: So I must say, I'm in general only on the investment and corporate banking side, so
to say. I hardly have any point of contact with the retail side. So most people, everybody
who listens to what we're talking about, of course do their financial stuff maybe on a daily
basis with the apps on their phones or maybe with online banking on the web. But this is not
the part I'm typically involved in. There, I think, there are tremendous success stories of
data science. For example in credit lending, which is mostly automated these days
using machine learning algorithms, like credit scoring and so forth. But again, this is not
my field of expertise.
Yves: So what I'm mostly concerned with is the corporate and investment banking side, and
there in particular derivatives and the trading side. And in general, this is how I think
of this little world, smaller than the world of others. First, it's financial data science: so
whenever you need to crunch the petabytes of data that are available these days,
it is financial data science, and as we discussed with regard to quants, we might find
people in different areas of a bank, for example, in the hedge fund world, or at another
buy-side company, like an asset management company, who are concerned with financial
data science, in general these days on a larger scale. So crunching any type
of financial data, market data, unstructured data like news data and so forth. Then we
have the trading side of things, where people try to come up with algorithms, and there
are different types of algorithms. And more and more we see people trying to apply AI-based
algorithms. Contrary, let's say, to some deterministic algorithms that
you need to execute larger trades, they try to come up with machine learning and AI-based
algorithms to predict markets, and if they are good enough at predicting markets,
they might benefit from that as well.
Yves: So we recently started such an endeavor ourselves as well. Then of course, there is
computational finance, which includes areas such as derivatives and options pricing. This
also includes risk management. For example, risk management is still a big topic. So
you have a big investment bank that is sitting on, let's say, a million plus derivatives
positions overnight, and one of the major tasks overnight is to come up with some risk
numbers for the complete portfolio that they face. So maybe people have heard about
value at risk or credit value at risk. It's really computationally demanding, and such
jobs are running on huge clusters with thousands of computers overnight to crunch the
numbers in an appropriate way for the bank to get, let's say, a decent view of the risk
position. So computational finance usually is the most demanding in this field.
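To give a rough feeling for what such a risk number is, here is a toy value-at-risk calculation: simulate many one-day scenarios for a small portfolio and read off the loss that is only exceeded in 1% (or 5%) of them. This is a minimal sketch, not how a bank's overnight risk run works. The three positions, their volatilities, the independence of the positions, and the normal return model are all simplifying assumptions made up for the example, and the function name `value_at_risk` is hypothetical.

```python
import numpy as np

def value_at_risk(pnl, confidence=0.99):
    """Loss level that is exceeded with probability (1 - confidence)."""
    pnl = np.asarray(pnl, dtype=float)
    return float(-np.quantile(pnl, 1.0 - confidence))

rng = np.random.default_rng(seed=7)

# Hypothetical portfolio: position values and one-day volatilities (made-up numbers).
position_values = np.array([1_000_000.0, 500_000.0, 250_000.0])
daily_vols = np.array([0.02, 0.01, 0.03])

# Simulate one-day returns per position. Independent normal returns are a
# simplifying assumption for this sketch, not a realistic market model.
n_scenarios = 50_000
returns = rng.standard_normal((n_scenarios, len(position_values))) * daily_vols

# Profit and loss of the whole portfolio in each scenario.
pnl = returns @ position_values

var_99 = value_at_risk(pnl, confidence=0.99)
var_95 = value_at_risk(pnl, confidence=0.95)

print(f"1-day 99% VaR: {var_99:,.0f}")
print(f"1-day 95% VaR: {var_95:,.0f}")
```

With a million derivatives positions, each of which has to be repriced in every scenario, the same idea becomes the kind of cluster-sized overnight job described above.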
Yves: And this is where, as I mentioned before, we got started on a small scale, but these
days, for example, the biggest banks in the world, like Bank of America Merrill Lynch, mainly
use Python as the implementation language for their trading and risk management platforms,
although the hardcore calculations are still done in C++. Not so
much because Python is too slow, but because they started developing the pricing libraries
decades ago, and back then there was no way around C++ in this computationally
demanding area. So financial data science, algo trading, and computational finance are at
least the areas where we focus on and apply data science techniques in the financial field.
Hugo: Fantastic. Could you just slightly unpack the difference between financial data
science and computational finance a bit more?
Yves: Yeah. Typically what you do in data science is that you have a look at the data that is
there, meaning historical data. Be it on a simple level, end-of-day data for the Apple stock
over 10 years, then you have probably some 252 data points per year. After 10 years you
have some 2,500 data points. So this is not really challenging these days, as we know, but this is
basically what every financial theory is based on. But when you have a look at the
Apple tick data, which is provided by Nasdaq or by data providers such as
Bloomberg and Thomson Reuters, with which we work, then you might get some 2,000
points per quarter of an hour, or 8,000 per hour. So this is then where we get to bigger
data, and people need different techniques than, let's say, an Excel spreadsheet, for
example, to work with such data.
Yves: But no matter what, it's typically historical data,
and you might try to come up with some predictions, some forward-looking numbers or
whatever, based on the historical data. Computational finance, more or less with
regard to the areas that I've been describing, like risk management, is typically based
on Monte Carlo simulation, which is by definition a forward-looking technique.
So while I might have a look backwards, 10 years, at Apple's stock, in computational
finance, when I want to price a derivative, I have a look forward, let's say over three
months or 12 months or two years, three years, and try to simulate the markets and model
correlations between different risk factors to come up with a reasonably good
understanding of what the future might look like in terms of market prices and other
relevant quantities. So my thinking is that in financial data science, we look at the existing
data and try to come up with certain points to predict, but computational finance in and
of itself has a forward-looking element that is dominant. You're trying to better
understand what the future might look like, not by coming up with a single, let's say,
forecast for the Apple stock price in 12 months, no, but rather with the distribution of
possible Apple stock prices in 12 months, based on 100,000 or 500,000 simulations of the
Apple stock price.
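The forward-looking idea can be sketched in a few lines: instead of one forecast, simulate many possible prices in 12 months and look at the resulting distribution. The model used here, geometric Brownian motion, is a common textbook choice and an assumption of this sketch, not something stated in the interview. The starting price, drift, volatility, and the function name `simulate_terminal_prices` are likewise made up for illustration.

```python
import numpy as np

def simulate_terminal_prices(s0, mu, sigma, horizon_years, n_paths, seed=0):
    """Simulate prices at the horizon under geometric Brownian motion (an assumed model)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_paths)
    drift = (mu - 0.5 * sigma ** 2) * horizon_years
    diffusion = sigma * np.sqrt(horizon_years) * z
    return s0 * np.exp(drift + diffusion)

# Hypothetical inputs: price 100 today, 5% drift, 20% volatility, 12-month horizon.
prices = simulate_terminal_prices(
    s0=100.0, mu=0.05, sigma=0.2, horizon_years=1.0, n_paths=100_000
)

# The result is a distribution of possible prices, not a single forecast.
mean_price = prices.mean()
p5, p50, p95 = np.percentile(prices, [5, 50, 95])

print(f"mean simulated price: {mean_price:.2f}")
print(f"5% / 50% / 95% percentiles: {p5:.2f} / {p50:.2f} / {p95:.2f}")
```

A production setup would simulate full paths for several correlated risk factors; this sketch only draws the price at the horizon for a single one.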
Hugo: Fantastic. Now there are two terms that you've used that I just wanted you to
explain briefly for me and the listeners, derivatives and options because none of us
necessarily know much about finance.
Yves: Yeah, sure, sure, sure. These are ... Actually they are involved, and typically as a, let's
say, general investor, or when you say you want to save for retirement, you typically don't
get in touch with these instruments, but they are used in many different areas. They are
first of all used to do some risk management. You can use derivatives in general, options are a
subclass of derivatives, to do risk management. You can also use them to speculate and so
forth, but basically what they typically are is this, and this is where the name comes from.
Yves: Their price is derived from another financial instrument. So for example, the Apple
stock is traded on Nasdaq and you can buy it and the price might go up or down. This is a
straightforward thing, but there are options traded on things like the Apple stock or on the
S&P 500 or on other instruments, which derive their price directly from what
the underlying (this is how it's called, in this case the Apple stock) is doing. And the
option, a call option for example, would represent the right to buy the Apple
stock at a certain point in the future at a predetermined price.
Yves: So, options in that sense typically represent some rights which you can exercise, but
typically you are not required to exercise them. So this is where the optionality comes
from. There are other derivatives, like futures, that are unconditional. So you buy them, and the
price is also derived from something else, but this is more or less live or die. Once you're in,
you can only sell this thing. But with the option you have the right to buy something at a
predetermined price at a predetermined date in the future, or over a certain period of time, or
to sell it, in which case we speak of a put option: put options to sell, call options to buy. And the
pricing of these instruments might get really tricky and involved, and you need advanced
financial mathematics in order to come up with a proper price. The pricing of the Apple stock
is actually pretty simple for a single trader.
Yves: You simply open your browser and you look at the price and it's there. But to
price derivatives, it's not straightforward. So this is why many books
have been written on this topic.
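The difference between the right (option) and the unconditional contract (future) shows up directly in what each is worth at expiry. The small sketch below only covers these payoffs at expiry, which is the easy part; the price before expiry is where the advanced mathematics comes in. The function names and the example numbers are chosen for illustration.

```python
def call_payoff(spot_at_expiry, strike):
    """Right to buy at the strike: exercised only if the market price is higher."""
    return max(spot_at_expiry - strike, 0.0)

def put_payoff(spot_at_expiry, strike):
    """Right to sell at the strike: exercised only if the market price is lower."""
    return max(strike - spot_at_expiry, 0.0)

def long_future_payoff(spot_at_expiry, agreed_price):
    """Unconditional: the holder gains or loses the full price difference."""
    return spot_at_expiry - agreed_price

# Hypothetical example: predetermined price of 100, two possible prices at expiry.
for spot in (80.0, 120.0):
    print(
        f"spot {spot:6.1f}: call {call_payoff(spot, 100.0):5.1f}, "
        f"put {put_payoff(spot, 100.0):5.1f}, "
        f"future {long_future_payoff(spot, 100.0):6.1f}"
    )
```

The option payoffs never go below zero because the holder can simply not exercise, while the future can end up with a loss.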
Hugo: For sure. Great. That makes perfect sense. Now the other thing you mentioned a
couple of times when talking about the major subdisciplines of finance that data science
is having an impact in: you talked about machine learning and artificial intelligence, so I
was wondering what you see as the role of these two coupled technologies and ways of
thinking about modeling the world. What impact they're actually having, or whether we
should have a healthy skepticism as well, concerning things that are buzz terms as well as things
that provide a lot of value. So how does this apply in finance?
Yves: Well, actually I'm getting to the trading side of things. Just today a book arrived
which is called Pattern Trading. I discovered it in a magazine on the weekend. I said,
"Well, let's have a look at this book. Looks pretty nice." And already the name suggests it:
pattern trading means trading based on some patterns or price formations that you see
with regard to a financial instrument. So for example, the Apple stock, or this can be, let's
say, the gold price, or it can be the Euro/US Dollar exchange rate. The theory goes that
when you see certain patterns in the prices, this might in one case signal further upward
movement or in another case downward movement, or that the market is most likely to
move sideways. But again, it's all based on patterns, and I think most of the listeners are
completely aware of the fact that machine learning techniques are pretty good at learning
about, let's say, the value of patterns.
Yves: First of all, recognizing patterns and second of all, coming up with predictions based
on patterns. Same example as before, Spotify. When people listen to songs A, B, C, D, E, F, G,
then the fantastic new song might be something for you, because you have also been
listening to songs A, B, C, D, E, F, G. And this is the same with patterns. If you see patterns in
markets then you might say, "Well, the sequence was, let's say, up, down, up, down." Then the
machine might learn that with a high likelihood it's more probable that the market goes
down afterwards, or up afterwards. So this is what the machines of course should do, they
should learn what is happening. When I give talks, I typically have a couple of pictures
showing patterns, and people start to nod and say, "Yeah, I know this one, I know this
one, I've been trading on that one." But my argument is, and you have been asking me
about how machine learning, deep learning, all these techniques might influence markets.
Yves: What I usually say is this: I'm not saying that there is nothing in these patterns, nor that
there is something in these patterns. What I'm saying to people is that, if there is
something in these patterns, then for sure machines are better at recognizing these
patterns and learning these patterns and then at executing trades based on these
patterns. Because, I mean, we all know a human being might learn, over the course
of his trading lifetime, I don't know, 20, 40 patterns maybe. A machine doesn't have any issue
learning patterns which are pretty complex. Let's say patterns based on 100 features, for
example, and maybe 150,000 relevant patterns that it immediately recognizes.
And of course, when we are talking about trading, it's about seizing opportunities. The faster
you can trade on what you see, the better it usually is, and the less emotional you are,
the better it usually is.
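The "up, down, up, down" idea can be made concrete with a deliberately simple stand-in for a learning algorithm: count, for every short pattern of past moves, which move followed it most often, and use that as the prediction. This is plain frequency counting on made-up data, not one of the machine learning or deep learning models discussed here. The function names, the pattern length, and the toy history are assumptions for illustration.

```python
from collections import Counter, defaultdict

def learn_patterns(moves, length=4):
    """Count which move followed each pattern of `length` consecutive moves."""
    followers = defaultdict(Counter)
    for i in range(len(moves) - length):
        pattern = tuple(moves[i:i + length])
        followers[pattern][moves[i + length]] += 1
    return followers

def predict_next(followers, recent_moves):
    """Return the move seen most often after this pattern, or None if it is unknown."""
    counts = followers.get(tuple(recent_moves))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical history of daily market moves (toy data, not real prices).
history = ["up", "down", "up", "down", "up"] * 4

followers = learn_patterns(history, length=4)

seen = ["up", "down", "up", "down"]
unseen = ["down", "down", "down", "down"]
print(seen, "->", predict_next(followers, seen))
print(unseen, "->", predict_next(followers, unseen))
```

A real system would work with many numerical features per pattern and a proper model, and would have to be validated on data it has not seen; the point of the sketch is only that a machine can keep track of far more patterns than a human trader.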
Yves: And this is, I think, what the advantages of the machines are compared to human
beings. I think we are not yet at the point where in every single area the machines and the
algorithms will replace the traders. But there are good examples, like Goldman Sachs.
I always use this quote, which I also read in The Economist, for example: in 2000
Goldman Sachs had 600 equity traders on a single trading floor, and of the 600 there
are just two left, just two people, and the rest have kind of been replaced by technology. And of
course technology doesn't build itself. So let's say the human resources have
shifted from the trading skill to the technological skill.
Yves: I mean, you need people, of course, who build the systems and so forth. So
people say, "Well, this job is about to be replaced by machines." But there must be people
who build the machines and who write the software that really replaces the
people. And this is what we see in the financial markets, I think in spectacular fashion:
they are looking for more and more technologists, data scientists, and programmers that are
able to build the machines that you require these days in order to be successful in
markets, but other stuff that has been done manually in the recent past is not en vogue
anymore. And even high-paying jobs are suffering in this regard, like the equity traders that
I mentioned in the Goldman Sachs example.
Yves: Yeah, I mean this is exciting. This describes it, and it's already quite a while ago
that this statement was made, actually. What we see now, I mean, usually when
you have something new, people try to rush in one particular direction. But I think
now we're getting back to a point where we say, "Well, we might need the market-savvy people
still." You know, when you hire somebody who is pretty good at programming but has no
experience in markets, what would you expect these people to program into the
trading applications in terms of risk and safety measures and so forth? A little bit of
understanding is simply required in that sense. What we see, and I think this will be
the near future at least, is that these days they try to merge the worlds, in the
sense of people doing, let's say, simple, or rather from a data point of view simple,
equity research.
Yves: They start using our fantastic technologies, like pandas for data crunching, or
visualization with, let's say, I don't know, Plotly, and all these things that we use on a
daily basis, to become better at their jobs and maybe at the same time accomplish more, or
to be able to crunch ever-increasing amounts of data. And also the traders. I mean, history
was kind of like this: there was one trader, and maybe two people on the left and the
right-hand side of the trader who were programming the Excel spreadsheet
applications in real time if the trader had a new idea. These days the traders and the people
responsible are probably required to come up with their own solutions, using different
techniques than an Excel spreadsheet with some people sitting on the left and the right
doing real-time tweaks while the trader is trading.
Yves: So this is one example that I recently retweeted from Fortune. The headline said,
"Well, forget Wall Street lingo, the language Citigroup wants its incoming investment bank
analysts to know is Python." So even in a field like investment banking, or let's say people
working for consulting companies, they are these days required to have some
programming knowledge. This hasn't been the case before. Which M&A banker used
Python 10 years ago, or five? Nobody. But these days, instead of just being
expected to know a little bit about Excel, because in every field they are
facing these huge amounts of data, and people now know that it's much more efficient to
crunch these numbers and data with a technology such as Python, they are expected to
know a little bit about programming as well. And I wouldn't say that everybody these days
should become a software developer, architect or engineer, but you know,
with a little bit of training, you can accomplish quite a lot compared to the traditional
approaches in this field. So it's more like hybrids emerging. I think hybrid is kind of a
trendy word anyway. So it's a hybrid skill set that is required: market knowledge, your
background, maybe more on the banking side or the market side, or whichever department you
belong to, but nevertheless also knowing how to code. It's a little bit like math, you know, math
never hurts. And I think knowing about programming and data science doesn't hurt either
these days. So English, math and Python. These are, I think, the three most important
languages that anybody should master before getting a job or changing jobs.
Hugo: Exactly. And that actually reminds me that I've heard you say, when asked why
Python for finance so much, that you consider Python to be the
English of the data and financial world. And I'm wondering why that's the case.
Yves: Sure. Many people say, "Well, you know, this language has this fantastic feature" or
"Julia is faster" and this and that. I think most people, and we were just talking about the
investment bank analysts that are supposed to learn Python: how many languages should
such a person in their expected capacity learn? I mean, it's hard to master a single
language properly. So I'm coming more from a time constraint, resource constraint
point of view where I say, "Well, if you only have time to learn a single programming
language, then it should be Python." If you only have time to learn one foreign language,
one spoken language, for almost everybody around the world it is these days English. And this
is where I see kind of the parallels and say, "Well, not too many people easily learn three or
four languages, neither spoken languages nor programming languages."
Why Python?
Yves: Python, why Python? I mean, of course it's the proper one. When I started back in
the day, this was for me the first proper contact, I must say, with a scripting language,
which at a high level allowed, yeah, fast interactivity. Even back then, even without
IPython, you could do amazing things in the shell and so forth. I wasn't used to that.
When I grew up I actually started coding Assembly and BASIC on a Commodore C64, you
know, this is where I came from.
Yves: Then I did C at university, compile cycles and so forth, and then came this
fantastic scripting language. On top of the interactivity, it was so close to
mathematical language. So when you have the financial theory, an equation or whatever,
you have a few equations there, it's pretty straightforward, and without that much
training you can translate what the math, the finance says into Python. This is what got me
hooked there.
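
To illustrate how closely a financial equation can map to Python, here is a minimal sketch. The choice of formula (the Black-Scholes-Merton value of a European call) and all names (norm_cdf, bsm_call_value, S0, K, T, r, sigma) are assumptions made for illustration; Yves does not name a specific equation at this point in the conversation.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bsm_call_value(S0: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes-Merton value of a European call option.

    Mirrors the textbook formula term by term:
        d1 = (ln(S0 / K) + (r + sigma^2 / 2) * T) / (sigma * sqrt(T))
        d2 = d1 - sigma * sqrt(T)
        C  = S0 * N(d1) - K * exp(-r * T) * N(d2)
    """
    d1 = (log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money call, one year to maturity,
    # 5% short rate, 20% volatility.
    print(round(bsm_call_value(S0=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2), 4))
```

Each line of the function corresponds to one line of the formula in the docstring, which is the kind of near one-to-one translation described above.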
Yves: I think this is not the major argument these days anymore. From my point of view,
Python is kind of the orchestrating language for all the technologies that you need there. I
think it is the best language to use for data science, as a first-class citizen in that world. When
people these days use TensorFlow, they use Python as their interface. We have the fantastic
scikit-learn package and many, many others. I can't even get to a somewhat
comprehensive list in this regard. And I think this is what actually differentiates Python
compared to all the other competitors. Let's start with the established ones in our
field, C++, Java or C#, or take Julia, which is a typical competitor, not in terms of
numbers but in terms of being pretty close with regard to the syntax and the approach
and so forth, but the ecosystem is missing.
Yves: And I think the ecosystem is what makes Python unique. And today, for everybody who
wants to enter the financial field, I think on top of all my arguments, which might be
subjective, it is simply that almost every institution has chosen Python as a core language. So
if you are planning some kind of career in our field, Python is simply a good
thing to have on your CV. You might prefer some other more exotic, maybe faster
or whatever language, but if your potential employer doesn't use this language, you probably
won't get too many plus points on your evaluation afterwards. So this is more the
career aspect: if you know Python, you can work in many, many fields and many
companies these days.
Hugo: Exactly. Having Python on your CV and resume is incredibly important, but
now having Python in your GitHub repository is also a huge step in the interview process,
right? When applying for jobs in finance.
Yves: Sure, this is one of the fantastic aspects of living in the open source age, that you really can
showcase what you've been working on. Of course, most people having kind of a
professional job are usually not allowed to talk about what they're doing. The
financial industry is really secretive in this regard, and this is why the industry loses
many fantastic people, we might say, to the more open companies. Due to regulation,
legal and all these things, people are hardly ever allowed to really talk about what they are
doing. But of course you can do stuff on the side, and you can build your programming or
data science CV easily on platforms such as GitHub. On the other hand, of course, these
days there are side effects.
Yves: There are some companies specialized in crawling platforms like
GitHub and looking for people, so replacing maybe the search on LinkedIn or via other
means for talent in this regard. So every once in a while I get approached there as well, asking if I
would be interested in changing jobs. Maybe they should improve their research; then
they would have seen that I'm running my own company, and maybe they wouldn't even ask.
But for other people, this might represent an opportunity to present themselves, to
showcase what they can do, and also to learn and get feedback from the community. If
they're working on something of interest to others, they might even win over
contributors or get feedback, and maybe some fame even in the community.
Hugo: For sure. And something we've really been talking around, in the emergence of
Python as such an important tool in finance and data science and in the use of GitHub and
Jupyter notebooks: we've actually seen a huge shift in the past several decades
across the board, not only in finance but in academic research and all types of
quantitative, computational disciplines, a move from proprietary tools to open source
tools, right?
Yves: Yeah, sure. I mean, this is sort of one of the benefits if you have a huge community
behind it. I think hardly any commercial company can keep up with such a community
effort. You have millions of users of Jupyter notebooks, and not only for Python; of
course we have multiple other kernels, so you can use R or Julia kernels and use the
same environment. This is fantastic and the project has been growing tremendously. I'm
really happy about it. Also for us, since we are more or less a content
creation company, it is a fantastic means of creating content and sharing it. For example, I'm
currently writing the second edition of my Python for Finance book. It is basically all Jupyter
notebook based, all the code is in Jupyter notebooks, and one of my books was
actually written 100 percent in Jupyter notebooks. I then programmed a little
workflow behind it, translated it into LaTeX, and finally Wiley published what ... it all
started in Jupyter notebooks.
Yves: So there are many, many things that you can do, also with regard to what I'm a big fan of:
reproducible research. When I was growing up, doing my research, you just had back
then mostly printed research papers, and people presented just the result, but you never
could somehow get to the point where you could say, "Well, what data have they been working
with? How did they crunch the data? Are these results that they presented reliable?" and so
forth. Today, with access to many open data sources and with people providing, for example,
Jupyter notebooks, not only with text but with the code that crunches the data and
presents the results, it's fantastic when you're after reproducible research, which, again,
I'm a big fan of.
Yves: Python for sure. I mean, if you believe in polls and overviews and surveys
and so forth, people typically say that the most common combination is Python, R and
SQL. So you need some databases somehow. I think we have multiple other options these
days, like HDF5, so we don't necessarily need SQL, but Python for sure. Maybe some R here and there if you
don't find, say, a statistical package in the Python universe, but this has hardly ever been
the case for what we have been doing for ourselves and also for clients. Then of course
machine learning and AI in general: to know about the basics of statistics, to know about the
algorithms, what unsupervised learning and supervised learning are, all the techniques there,
logistic regression, Gaussian naive Bayes, whatever jumps to mind. I'm pretty sure you don't
need to be an expert in every single area of this field.
Yves: So it is less about knowing all the theories and more, from an applied perspective, about knowing what is
there and knowing what to apply when given certain datasets. Then there are the points
which I usually summarize as the basics. We have, for example, in our certificate program,
which is the largest, longest-running online training program that we have, a complete
area called tools and skills, where we teach basic tools and skills that people, from our point of
view, need to know at least a little bit about, like using editors, processes, setting up
environments, deploying stuff in the cloud, working a little bit with Docker containers.
Of course, all the topics that I've mentioned just now probably require complete
study over years, but knowing the basics typically helps you a lot: the basics
of Linux, a few command line tools, some dev tools, Python packaging, publishing, testing,
all of these topics, software engineering basics in general. You may not be an expert
after that, but I think the first 10% already helps you with 70% of your problems.
Yves: Then there is data storage. Working with data is important of course, particularly in
the financial field. Thinking of trading, if you implement backtesting programs you need to
be able to work fast with huge amounts of data, to crunch the data, to store it correctly,
to store your results and all these things, but also, for example, to work with streaming
data, which is actually pretty important in our case. Not in every area do you have the need
to process data in real time, but in finance this is generally the case. For example, in
trading, when you do algo trading, you need to be able to digest tick data, streaming data,
in real time, to crunch it, to sample it, to come up with the signals in real time based on
your trained models and to act in real time.
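
The pipeline described here (digest ticks, sample them, derive a signal) can be sketched in plain Python with generators. This is only an illustration under stated assumptions: the tick format (timestamp in seconds, price), the function names resample_last and momentum_signals, and the toy momentum rule standing in for a "trained model" are all hypothetical and not taken from the interview.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical tick format: (timestamp in seconds, price).
Tick = Tuple[float, float]
Bar = Tuple[int, float]


def resample_last(ticks: Iterable[Tick], bar_seconds: float) -> Iterator[Bar]:
    """Sample a tick stream into bars, yielding (bar_index, last_price)."""
    current_bar: Optional[int] = None
    last_price = 0.0
    for timestamp, price in ticks:
        bar = int(timestamp // bar_seconds)
        if current_bar is not None and bar != current_bar:
            yield current_bar, last_price  # the previous bar is complete
        current_bar, last_price = bar, price
    if current_bar is not None:
        yield current_bar, last_price  # flush the final, possibly partial, bar


def momentum_signals(bars: Iterable[Bar]) -> Iterator[Tuple[int, int]]:
    """Toy rule: +1 after a rising bar, -1 after a falling bar, 0 if unchanged."""
    previous: Optional[float] = None
    for bar, price in bars:
        if previous is not None:
            yield bar, (price > previous) - (price < previous)
        previous = price


if __name__ == "__main__":
    # A short, made-up tick stream; in practice this would come from a socket.
    ticks = [(0.5, 100.0), (1.2, 101.0), (4.9, 102.0),
             (5.1, 101.5), (9.0, 101.0), (10.2, 103.0)]
    for bar, position in momentum_signals(resample_last(ticks, bar_seconds=5.0)):
        print(bar, position)
```

Because both stages are generators, each signal is produced as soon as the bar that triggers it is complete, without storing the full tick history.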
Yves: From experience, this is something that people have a hard time
getting started with, but it's simply required, because otherwise how do you want to
automate things with regard to algo trading if you don't know how to work with sockets
and streaming, and maybe even streaming visualization, for example, to implement
your trading strategies, to stay with this example. I think these are, from
my point of view, the skills and technologies that people should know about.
Hugo: Great. That's very useful. I think it isn't as though you just need to learn all of these
things straight away, right? I mean essentially you can learn them on the go in a project
based way.
Yves: Yeah, sure. This is what we usually tell people as well. We say, well, we provide
you with a basic overview and maybe skip some details in the beginning. So with our program
we have, let's say, a 12 week structured schedule, so to say, where we try to cover the bases,
but afterwards it's more practical things, practical modules, and people are supposed,
for example, to do a final project. There they can select a topic, but they are then
expected to apply what they learned to different things, and from experience this is where you really learn
about this stuff. Maybe you know about it after the formal education part, but you'll learn
about it once you get started applying it.
Yves: This is what people write back to me, saying, "Well, I couldn't imagine that one day
we'd be sitting here spinning up cloud instances and deploying Python code in the cloud
and doing remote monitoring based on socket streaming and so forth. Now it feels like
second nature." But I'm pretty sure when these people saw for the first time what this is
all about, they thought, "Oh, I will never master that." So this is kind of a natural reaction, but of
course applying it to something that interests you, when you have a purpose, this is when
you get started learning about stuff and learning to master it.
Hugo: Exactly. You've touched on this already, but where would you
suggest beginners who want to work in financial data science look?
Yves: I can hardly say anything else, because it's at the core of our business that we
provide online trainings, and also live trainings in the form of bootcamps, in London and
New York usually. But if you carefully look at our pages, we have kind of a broad offering
which has been growing over the years, and which of course is influenced by what we have done
over more than 10 years. I've been working for the biggest hedge funds, for big banks and
other financial institutions in this regard. We know what is kind of expected, and this is on
one of our subpages, training.tpq.io. The certificate that I mentioned before is
really our flagship offering, where we have a 16 week program after which we hope that
you're able to use Python for algo trading or for other financial things. But algo trading is
the focus, and we are even able to award a German university certificate because we are
cooperating here with our local university. If you're doing a master's program within
Europe, this is even good for five ECTS points, from the European Credit Transfer System, which
might be interesting, but only of course for current students or future students in this
regard. But it's a formal certificate from a university for Python for algo trading or finance,
depending on you.
Hugo: Fantastic. We'll make sure to link to those in the show notes as well.
Hugo: So in general, what does the future of data science in finance look like to you?
Yves: This is, from my point of view, an easy one. Data science will become a core discipline of
finance. I said it before: we come from an era where brains and math equations have
been driving finance, and in this regard data-driven finance is what will replace it.
So we might lose quite a bit of the beauty of financial equations and modeling and so forth,
but on the other hand we might get back what I like to call the scientific method in this
context: we start with the data, we have a deep look at the data in every
area that we have in the financial industry, and apply the new algorithms. In that
sense I think finance will be more driven by stuff that is developed outside of our industry
than ever before. The nice theories that are still around and
are applied, some of them successfully, others not that much, usually came from
finance professors or finance practitioners.
Yves: But now people start using stuff maybe developed within, let's say, Google, where
the major point was to have a good algorithm to play Go or to build a self-driving car, so a
completely different background. But the background can easily be changed
to a financial background with some adjustments, and in this regard I'm happy that we have,
since March this year, the first proper book about financial machine learning, because
many algorithms can easily be transferred, but we have some specifics to consider when
we apply the algorithms with regard to the data: how we crunch the data, how we manage
the data, what the special things are when we look at financial time series in
contrast to, let's say, a physical time series of whatever kind. In that sense it will
become commonplace, not something special. It will become a core thing. It's as when you ask, "Well, what do
you think of technology in the banking industry?" Many people say banks are essentially
technology companies, and I think the data-driven, AI-first finance future is not too far
away from my point of view.
Hugo: Awesome. So a question that popped into my head during that was, given that data
science and data literacy are such important fundamental skills in finance now and will be
more so, how long would it take for a working data scientist to get up and running to
work in finance? Like someone who knows the scientific Python stack, for example?
Yves: It depends a little bit on the area. I mean, the more quant oriented it is, the more
sophisticated the financial stuff is that you are faced with, the longer I think it might take.
There are some areas, again on the retail side, which I'm not that much concerned with,
where people might find it pretty easy, because on the retail side, facing many consumers,
this is similar to what people do on social media or with the recommender engines that I
mentioned a couple of times. I think there it's pretty straightforward to get up and
running immediately.
Yves: But in other fields, depending on what you do, for example, getting back to
computational finance and derivatives pricing, there are now a couple of research papers, and
last year I also gave a one-day elective about the potential of applying machine learning
techniques in the field of derivatives pricing and quantitative finance in general. There,
of course, you need to have the background for what has been there before and what
the basic rules are in order to apply this stuff, and there I think it might take a little
while to get up and running. But in the end it all boils down to how good your
math background is and what your programming and data science background is.
Yves: English of course is something I would assume is there, and with these three basic
skills you then learn the financial lingo and so forth. This is then usually the easy step, but
the math in some areas you simply need to have in order to get up and running with the
stuff that you're supposed to do there.
Call to Action
Hugo: So my final question is, do you have a final call to action for all our listeners out
there?
Yves: Yeah, sure. I mean, focusing again on the Python for finance side, it started
some, let's say, three years ago when people reached out and said, "Well, I'm interested in
machine learning. I want to apply it in finance. I have this and that background and want to make a move."
And still today, I must say, when people try to get into the field
and to apply data science and machine learning and try to profit, be it within a corporate
context or as a retail algo trader, let's say, trading their own cash positions, it's still
hard to get up and running. I can only recommend, from experience and since we're
doing it, to look for an appropriate integrated training program. There are so
many things available for free today. There are so many university based things that might be
remotely related to what you're looking for. But many people told me and confirmed that
they have wasted months and months trying to look for that stuff on their own. So we
have made a living out of putting this all together into integrated training programs and
documentation. For example, with our program you get 1,200 pages of documentation only
about Python for finance and algo trading, so it's quite a bit. So first find
something like this. It doesn't necessarily have to do with what we do, since not everybody is
interested in algo trading, but something like this where you say, "Well, this is a good
starting point. This provides me with the broad overview, but also with the details that I
need to get started."
Yves: Then you should focus on the basics, and this is where many people in our program,
I wouldn't say really complain, but where they say, "Well, I'm having a really hard time
with mastering the basics." For example, for people who maybe just
have a Windows background and have used the regular tools there and so forth, setting up a
droplet for the first time, moving around in a Linux environment and so forth, might seem like a really
difficult thing. But I mentioned it before, some people say after a few weeks, "Now it feels like
second nature." This means you can get to the point where you know how to use Vim via
SSH access to a cloud instance, you deploy your automated code and work with sockets
and whatever. So master the basics, look at the nice tools that are there in the Linux
world, and master the processes there.
Yves: I think then, once you get through the training and you have the basics ready, be it more
on the finance side or more on the programming side, you should get started with
implementing many little projects that interest you. This is what we typically do
as training samples, where you might say, "Well, I want to have a little app where I simply put
in, let's say, a stock symbol, and my Flask app that I host on a DigitalOcean droplet then
shows me a graph with some simple moving average." For somebody who knows what
this is, this might take less than an hour, if you know what to do, but if you're just getting
started with these things, this might take like a week or two. But to have these projects
where you have indeed an end result that you might even showcase to your colleagues and
friends, that you might put on GitHub or wherever, this is, I think, where the learning curve
gets really steep. Towards the end, I'm pretty sure everybody who wants to enter this
field and wants to learn about all these fantastic technologies and the challenges in the
field might have something in their head, like building their own algorithmic trading
operation or coming up with their own derivatives pricer or coming up with a machine
learning application for portfolio management, for example. After that,
they should come up with such a project, scope it, specify it, get started, and then work
over weeks or even months and build something huge. I think this is then where you
collect on the way all the pieces that might have been missing before, because you have
been really focusing on a few things. But once you get to the point where you master such
a huge project, I think you can say, "Now I'm proficient and can work in the industry. I got my
major project up and running."
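
The core calculation behind the little training project mentioned above, a simple moving average over a price series, can be sketched in a few lines. This is a minimal illustration only: the function name simple_moving_average and the sample prices are hypothetical, and the Flask app, data download and plotting described in the interview are deliberately left out.

```python
from collections import deque
from typing import Deque, Iterable, List, Optional


def simple_moving_average(prices: Iterable[float], window: int) -> List[Optional[float]]:
    """Return the simple moving average for each price.

    Entries are None until a full window of prices is available.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    buffer: Deque[float] = deque(maxlen=window)
    running_sum = 0.0
    averages: List[Optional[float]] = []
    for price in prices:
        if len(buffer) == window:
            running_sum -= buffer[0]  # this value is about to drop out of the window
        buffer.append(price)
        running_sum += price
        averages.append(running_sum / window if len(buffer) == window else None)
    return averages


if __name__ == "__main__":
    closes = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up closing prices
    print(simple_moving_average(closes, window=3))
```

In a fuller version of the project, the list of closing prices would come from a data source for the chosen stock symbol, and the result would be plotted next to the prices.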
Yves: Perfect.
Hugo: Fantastic. Yves, it has been such a pleasure having you on the show.