diff --git a/CourseSessions/Sessions23/Session2inclass_BR.Rmd b/CourseSessions/Sessions23/Session2inclass_BR.Rmd
new file mode 100644
index 00000000..dfb3c0b9
--- /dev/null
+++ b/CourseSessions/Sessions23/Session2inclass_BR.Rmd
@@ -0,0 +1,433 @@
+---
+title: "Sessions 3-4"
+author: "T. Evgeniou"
+output: html_document
+---
+
+
+
+The purpose of this session is to become familiar with:
+
+1. Some visualization tools;
+2. Principal Component Analysis and Factor Analysis;
+3. Clustering Methods;
+4. Introduction to machine learning methods;
+5. A market segmentation case study.
+
+As always, before starting, make sure you have pulled the [session 3-4 files](https://github.com/InseadDataAnalytics/INSEADAnalytics/tree/master/CourseSessions/Sessions23) to your GitHub repository (the file and directory names still say session 2, but these are the session 3-4 files; the names have not been updated yet, so for now just ignore them). If you pull the course GitHub repository, you also get the session files automatically. Moreover, make sure you are in the directory of this exercise. Directory paths can be a frustrating source of problems, so it is recommended that you use the R commands below to find your current working directory and, if needed, set it to the directory that holds the main files for the specific exercise/project (there are other ways, but for now just be aware of this path issue). For example, assuming we are now in the "MYDIRECTORY/INSEADAnalytics" directory, we can do this:
+
+```{r echo=TRUE, eval=FALSE, tidy=TRUE}
+getwd()
+setwd("CourseSessions/Sessions23")
+list.files()
+rm(list=ls()) # Clean up the memory, if we want to rerun from scratch
+```
+As always, you can use the `help` command in RStudio to find out about any R function (e.g. type `help(list.files)` to learn what the R function `list.files` does).
+
+Let's start.
+
+
The purpose of this exercise is to become familiar with:
+While doing this exercise we will also see how to generate replicable and customizable reports. For this purpose the exercise uses the R Markdown capabilities (see Markdown Cheat Sheet or a basic introduction to R Markdown). These capabilities allow us to create dynamic reports. For example today’s date is 2016-01-26 (you need to see the .Rmd to understand that this is not a static typed-in date but it changes every time you compile the .Rmd - if the date changed of course).
+Before starting, make sure you have pulled the exercise files on your github repository (if you pull the course github repository you also get the exercise set files automatically). Moreover, make sure you are in the directory of this exercise. Directory paths may be complicated, and sometimes a frustrating source of problems, so it is recommended that you use these R commands to find out your current working directory and, if needed, set it where you have the main files for the specific exercise/project (there are other ways, but for now just be aware of this path issue). For example, assuming we are now in the “MYDIRECTORY/INSEADAnalytics” directory, we can do these:
+getwd()
+setwd("Exercises/Exerciseset1/")
+list.files()
+Note: you can always use the help command in RStudio to find out about any R function (e.g. type help(list.files) to learn what the R function list.files does).
Let’s now see the exercise.
+IMPORTANT: You should answer all questions by simply adding your code/answers in this document through editing the file ExerciseSet1.Rmd and then clicking on the “Knit HTML” button in RStudio. Once done, please post your .Rmd and html files in your github repository.
+We download daily prices (open, high, low, close, and adjusted close) and volume data of publicly traded companies and markets from the web (e.g. Yahoo! or Google, etc). This is done by sourcing the file dataSet1.R as well as some helper functions in helpersSet1.R, which also installs a number of R libraries (hence the first time you run this code you will see a lot of red text indicating the download and installation process):
+source("helpersSet1.R")
+source("dataSet1.R")
+[1] “ticker SPY …” [1] “ticker AAPL …”
+source("dataSet2.R")
+[1] “ticker SPY …” [1] “ticker AAPL …”
+For more information on downloading finance data from the internet as well as on finance related R tools see these starting points (there is a lot more of course available):
+We have 2783 days of data, starting from 2005-01-04 until 2016-01-25. Here are some basic statistics about the S&P returns:
+Here are returns of the S&P in this period (note the use of the helper function pnl_plot - defined in file helpersSet1.R):
+Your Answers here:
line 96: ExerciseSet1.Rmd:write.csv(StockReturns[1:20,c(“SPY”,“AAPL”)], file = “twentydays.csv”, row.names = TRUE, col.names = TRUE)
line 8: dataSet1.R:mytickers = c(“SPY”, “AAPL”) # Other tickers for example are “GOOG”, “GS”, “TSLA”, “FB”, “MSFT”,
For this part of the exercise we will do some basic manipulations of the data. First note that the data are in a so-called matrix format. If you run these commands in RStudio (use help to find out what they do) you will see how matrices work:
+class(StockReturns)
+dim(StockReturns)
+nrow(StockReturns)
+ncol(StockReturns)
+StockReturns[1:4,]
+head(StockReturns,5)
+tail(StockReturns,5)
+We will now use an R function for matrices that is extremely useful for analyzing data. It is called apply. Check it out using help in R.
+For example, we can now quickly estimate the average returns of S&P and Apple (of course this can be done manually, too, but what if we had 500 stocks - e.g. a matrix with 500 columns?) and plot the returns of that 50-50 on S&P and Apple portfolio:
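+As a minimal sketch of how `apply` works on a matrix of returns, the chunk below uses simulated data as a hypothetical stand-in for StockReturns (the name `returns` and all numbers are made up for illustration), and a base R `plot` instead of the course helper `pnl_plot`. The second argument of `apply` selects the direction: 2 applies the function to each column (one average per stock), 1 applies it to each row (one portfolio return per day).
+
+```{r echo=TRUE, eval=FALSE, tidy=TRUE}
+# Hypothetical stand-in for StockReturns: 250 days of simulated daily returns
+set.seed(1)
+returns <- matrix(rnorm(500, mean = 0.0005, sd = 0.01), ncol = 2,
+                  dimnames = list(NULL, c("SPY", "AAPL")))
+
+# MARGIN = 2: apply mean to each column, giving one average return per stock
+avg_returns <- apply(returns, 2, mean)
+
+# MARGIN = 1: apply mean to each row, giving the daily return of a 50-50 portfolio
+portfolio <- apply(returns, 1, mean)
+
+# Cumulative sum of the daily portfolio returns (base R stand-in for pnl_plot)
+plot(cumsum(portfolio), type = "l", xlab = "Day", ylab = "Cumulative sum of returns")
+```
+
+The same call works unchanged for a matrix with 500 columns, which is the main reason to prefer `apply` over computing each average by hand.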
+We can also transpose the matrix of returns to create a new “horizontal” matrix. Let’s call this matrix (variable name) transposedData. We can do so using this command: transposedData = t(StockReturns).
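+To see what transposing does to the dimensions, here is a small sketch on simulated data (again a hypothetical stand-in for StockReturns; the names `returns` and `transposed` are only for illustration). After `t()`, rows and columns swap, so the margin passed to `apply` has to swap as well.
+
+```{r echo=TRUE, eval=FALSE, tidy=TRUE}
+# Hypothetical stand-in for StockReturns: 20 days, 2 stocks
+set.seed(1)
+returns <- matrix(rnorm(40, sd = 0.01), ncol = 2,
+                  dimnames = list(NULL, c("SPY", "AAPL")))
+
+transposed <- t(returns)
+dim(returns)     # 20 rows (days), 2 columns (stocks)
+dim(transposed)  # 2 rows (stocks), 20 columns (days)
+
+# The daily 50-50 portfolio: rows of the original are columns of the transpose
+portfolio_rows <- apply(returns, 1, mean)
+portfolio_cols <- apply(transposed, 2, mean)
+all.equal(portfolio_rows, portfolio_cols)
+```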
help(apply)), can you create again the portfolio of S&P and Apple and plot the returns in a new figure below?
nrow(transposedData) and ncol(transposedData).
This is an important step and will get you to think about the overall process once again.
+startDate = "2001-01-01" in Line 11.
+Finally, one can read and write data in .CSV files. For example, we can save the first 20 days of data for S&P and Apple in a file using the command:
+write.csv(StockReturns[1:20,c("SPY","AAPL")], file = "twentydays.csv", row.names = TRUE, col.names = TRUE)
+Do not be surprised if the csv file suddenly appears in your directory! You can then read the data from the csv file using the read.csv command. For example, this will load the data from the csv file and save it in a new variable called “myData”:
+myData <- read.csv(file = "twentydays.csv", header = TRUE, sep=",")
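+A minimal sketch of the write/read round trip, assuming simulated data in place of StockReturns and a temporary file in place of twentydays.csv (the names `returns`, `csv_file`, and `restored` are hypothetical). It shows one thing that is easy to miss: `write.csv` produces a comma-separated file, and the row names come back from `read.csv` as an extra first column unless you ask for them to be used as row names.
+
+```{r echo=TRUE, eval=FALSE, tidy=TRUE}
+# Hypothetical stand-in for StockReturns[1:20, c("SPY", "AAPL")], with dates as row names
+set.seed(1)
+returns <- matrix(rnorm(40, sd = 0.01), ncol = 2,
+                  dimnames = list(as.character(as.Date("2005-01-04") + 0:19),
+                                  c("SPY", "AAPL")))
+
+# Write to a temporary comma-separated file, then read it back
+csv_file <- tempfile(fileext = ".csv")
+write.csv(returns, file = csv_file, row.names = TRUE)
+myData <- read.csv(file = csv_file, header = TRUE)
+
+dim(returns)  # 20 rows, 2 columns
+dim(myData)   # 20 rows, 3 columns: the row names are now a first column
+
+# Reading the first column back as row names recovers the original shape
+restored <- read.csv(file = csv_file, header = TRUE, row.names = 1)
+all.equal(as.matrix(restored), returns)
+```
+
+Comparing `dim(myData)` with the dimensions of the original matrix is a quick way to check what a read actually produced before comparing values.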
+Try it!
+sum(myData != StockReturns[1:20,])myData + StockReturns[1:40,])
+Can you now load another dataset from some CSV file and report some basic statistics about that data?
+Finally, just for fun, one can add some interactivity to the report using Shiny. All one needs to do is set the eval flag of the code chunk below (see the .Rmd file) to “TRUE”, add the line “runtime: shiny” to the YAML header at the very beginning of the .Rmd file, set the markdown output to “html_document”, and then press “Run Document”.
+sliderInput("startdate", "Starting Date:", min = 1, max = length(portfolio),
+ value = 1)
+sliderInput("enddate", "End Date:", min = 1, max = length(portfolio),
+ value = length(portfolio))
+
+renderPlot({
+ pnl_plot(portfolio[input$startdate:input$enddate])
+})
+This is a recent research article that won an award in 2016. Can you implement a simple strategy as in Figure 1 of this paper? You may find these R commands useful: names, which, str_sub, diff, as.vector, length, pmin, pmax, sapply, lapply, Reduce, unique, as.numeric, %in%
What if you also include information about bonds (e.g. download the returns of the ETF with ticker “TLT”)? Is there any relation between stocks and bonds?
+Have fun
+