R: Data Analysis and Visualization

Ebook, 1,783 pages

About this ebook

Master the art of building analytical models using R

About This Book

Load, wrangle, and analyze your data using the world's most powerful statistical programming language
Build and customize publication-quality visualizations of powerful and stunning R graphs
Develop key skills and techniques with R to create and customize data mining algorithms
Use R to optimize your trading strategy and build up your own risk management system
Discover how to build machine learning algorithms, prepare data, and dig deep into data prediction techniques with R

Who This Book Is For

This course is for data scientists and quantitative analysts who want to learn R and take advantage of its powerful analytical design framework. It offers a seamless path to becoming a full-stack R developer.

What You Will Learn

Describe and visualize the behavior of data and relationships between data
Gain a thorough understanding of statistical reasoning and sampling
Handle missing data gracefully using multiple imputation
Create diverse types of bar charts using the default R functions
Familiarize yourself with algorithms written in R for spatial data mining, text mining, and so on
Understand relationships between market factors and their impact on your portfolio
Harness the power of R to build machine learning algorithms with real-world data science applications
Learn specialized machine learning techniques for text mining, big data, and more

In Detail

The R learning path created for you has five connected modules, each of which is a mini-course in its own right. As you complete each one, you'll have gained key skills and be ready for the material in the next module!

This course begins by looking at the Data Analysis with R module. This will help you navigate the R environment. You'll gain a thorough understanding of statistical reasoning and sampling. Finally, you'll be able to put best practices into effect to make your job easier and facilitate reproducibility.

The second place to explore is R Graphs, which will help you leverage powerful default R graphics and utilize advanced graphics systems such as lattice and ggplot2, the grammar of graphics. You'll learn how to produce, customize, and publish advanced visualizations using this popular and powerful framework.

With the third module, Learning Data Mining with R, you will learn how to manipulate data with R using code snippets and be introduced to mining frequent patterns, association, and correlations while working with R programs.

The Mastering R for Quantitative Finance module pragmatically introduces both the quantitative finance concepts and their modeling in R, enabling you to build a tailor-made trading system on your own. By the end of the module, you will be well versed in various financial techniques using R and will be able to place good bets while making financial decisions.

Finally, we'll look at the Machine Learning with R module. With this module, you'll discover all the analytical tools you need to gain insights from complex data and learn how to choose the correct algorithm for your specific needs. You'll also learn to apply machine learning methods to deal with common tasks, including classification, prediction, forecasting, and so on.

Style and approach

Learn data analysis, data visualization techniques, data mining, and machine learning, all using R, and learn to build quantitative finance models in the same powerful language.

About the authors

Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems. He graduated in cognitive science from Rensselaer Polytechnic Institute, and his thesis was strongly focused on using statistics to study visual short-term memory. Tony enjoys writing and contributing to open source software, blogging at http://www.onthelambda.com, writing about himself in third person, and sharing his knowledge using simple, approachable language and engaging examples. The more traditionally exciting of his daily activities include listening to records, playing the guitar and bass (poorly), weight training, and helping others.

Brett Lantz has spent more than 10 years using innovative data methods to understand human behavior. A trained sociologist, he was first enchanted by machine learning while studying a large database of teenagers' social networking website profiles. Since then, Brett has worked on interdisciplinary studies of cellular telephone calls, medical billing data, and philanthropic activity, among others. When not spending time with family, following college sports, or being entertained by his dachshunds, he maintains http://dataspelunking.com/, a website dedicated to sharing knowledge about the search for insight in data.

Jaynal Abedin currently holds the position of senior statistician at the Centre for Communicable Diseases (CCD) at the International Centre for Diarrhoeal Disease Research, Bangladesh (http://www.icddrb.org/). He attained his bachelor's and master's degrees in statistics from the University of Rajshahi, Bangladesh. He has extensive experience in R programming and Stata, and has good leadership qualities. He has contributed to two books on R and also developed an R package named edeR, short for e-mail data extraction using R, which is available at CRAN (http://cran.r-project.org/web/packages/edeR/index.html). He is currently leading a team of statisticians. He has hands-on experience in developing training material and facilitating training in R programming and Stata, along with statistical aspects in public health research. His primary areas of interest in research include causal inference and machine learning. He is currently involved in several ongoing public health research projects, and is a coauthor of nine peer-reviewed scientific papers. Moreover, he is involved in several work-in-progress manuscripts. He works as a freelance statistician in online marketplaces and has obtained a good reputation for his work.

Hrishi V. Mittal has been working with R for a few years in different capacities. He was introduced to the exciting world of data analysis with R when he was working as a senior air quality scientist at King's College, London, where he used R extensively to analyze large amounts of air pollution and traffic data for London's Mayor's Air Quality Strategy. He has experience in various other programming languages but prefers R for data analysis and visualization. He is also actively involved in various R mailing lists, forums, and the development of some R packages.

Bater Makhabel (LinkedIn: BATERMJ and GitHub: BATERMJ) is a system architect living across Beijing, Shanghai, and Urumqi in China. He received his master's and bachelor's degrees in computer science and technology from Tsinghua University between the years 1995 and 2002. He has extensive experience in machine learning, data mining, natural language processing (NLP), distributed systems, embedded systems, the Web, mobile, algorithms, and applied mathematics and statistics. He has worked for clients such as CA Technologies, META4ALL, and EDA (a subcompany of DFR). He also has experience in setting up start-ups in China. Bater has been balancing a life of creativity between the edge of computer sciences and human cultures. For the past 12 years, he has gained experience in various culture creations by applying various cutting-edge computer technologies, one being a human-machine interface that is used to communicate with computer systems in the Kazakh language. He has previously collaborated with other writers in his fields too, but Learning Data Mining with R is his first official effort.

Edina Berlinger has a PhD in economics from the Corvinus University of Budapest. She is an associate professor, teaching corporate finance, investments, and financial risk management. She is the head of the Finance department of the university, and is also the chair of the finance subcommittee of the Hungarian Academy of Sciences. Her expertise covers loan systems, risk management, and more recently, network analysis. She has led several research projects in student loan design, liquidity management, heterogeneous agent models, and systemic risk.

Ferenc Illes has an MSc degree in mathematics from Eotvos Lorand University. A few years after graduation, he started studying actuarial and financial mathematics, and he is about to pursue his PhD from Corvinus University of Budapest. In recent years, he has worked in the banking industry. Currently, he is developing statistical models with R. His interest lies in large networks and computational complexity.

Milan Badics has a master's degree in finance from the Corvinus University of Budapest. Now, he is a PhD student and a member of the PADS PhD scholarship program. He teaches financial econometrics, and his main research topics are time series forecasting with data-mining methods, financial signal processing, and numerical sensitivity analysis on interest rate models. He won the competition of the X. Kochmeister-prize organized by the Hungarian Stock Exchange in May 2014.

Adam Banai received his MSc degree in investment analysis and risk management from Corvinus University of Budapest. He joined the Financial Stability department of the Magyar Nemzeti Bank (MNB, the central bank of Hungary) in 2008. Since 2013, he has been the head of the Applied Research and Stress Testing department at the Financial System Analysis Directorate (MNB). He has also been a PhD student at the Corvinus University of Budapest since 2011. His main research fields are solvency stress-testing, funding liquidity risk, and systemic risk.

Gergely Daroczi is a former assistant professor of statistics and an enthusiastic R user and package developer. He is the founder and CTO of an R-based reporting web application at http://rapporter.net and a PhD candidate in sociology. He is currently working as the lead R developer/research data scientist at https://www.card.com/ in Los Angeles. Besides maintaining around half a dozen R packages, mainly dealing with reporting, Gergely has coauthored the books Introduction to R for Quantitative Finance and Mastering R for Quantitative Finance (both by Packt Publishing) by providing and reviewing the R source code. He has contributed to a number of scientific journal articles, mainly in social sciences but in medical sciences as well.
