Links to all tutorial articles (same as those on the Exam pages)
VaR disaggregation in R - Marginal and Component VaR
This article is a practical, real example of how value-at-risk, marginal VaR, betas, undiversified VaR and component VaRs are calculated. Credit to Pawel Lachowicz for the idea for this article; he has done the same thing in Python here: http://www.quantatrisk.com/2015/01/18/appliedportfoliovalueatriskdecomposition1marginalandcomponentvar/
The level of detail here may not be required for the PRM exam; for that, just the previous tutorial will do (the one with the formulae and explanations). This article will be useful for practising risk analysts: we will extract real end-of-day prices, and perform our analysis on those in R. To run the R code, you will need to install R on your machine. The good thing with R is that it does not require administrative privileges, so if you can download it from CRAN, you can run it. If you are looking for an introduction to R, you can look at this 20-minute video I posted a few years ago: https://www.youtube.com/user/riskprep
Hits: 3357
VaR disaggregation - Marginal and Component VaR
This article covers how Marginal and Component VaRs are calculated. We follow up (in a separate article) with a real-life example of how VaR, MVaR, undiversified VaR and Component VaR are calculated, based on actual price data pulled from Quandl, an open source market data website.
Hits: 12131
Understanding Kurtosis
The topic of kurtosis can cause some confusion. Think fat and thin tails, and "peakedness" and leptokurtosis. How is it that having a ‘pointier’ peak means having fat tails, and how is that different from a normal distribution that just has a smaller standard deviation? And how does a t-distribution manage to have fatter tails when its peak can in fact be lower than that of the normal distribution?
Hits: 27969
Understanding Principal Component Analysis (PCA)
Before we even start on Principal Component Analysis, make sure you have read the tutorial on Eigenvectors et al here.
This article attempts to provide an intuitive understanding of what PCA is, and what it can do. By no means is this a mathematical treatise, nor am I a mathematician. This is an attempt to share how I understand PCA, and how you might internalize it.
The problem with too many variables
The problem we are trying to solve with PCA is that when we are trying to look for relationships in data, there may sometimes be too many variables for doing anything meaningful. What PCA allows us to do is to replace a large number of variables with far fewer ‘artificial’ variables that effectively represent the same data. These artificial variables are called principal components. So you might have a hundred variables in the original data set, and you may be able to replace them with just two or three mathematically constructed artificial variables that explain the data just about as well as the original data set. Now these artificial variables themselves are built mathematically (and I will explain that part in a bit), and are linear combinations of the underlying original variables. These new artificial variables, or principal components, may or may not be capable of any intuitive human interpretation. Sometimes they may just be combinations of the underlying variables in a way that makes no logical sense, and sometimes they do provide a better conceptual understanding of the underlying variables (as fortunately they do in the analysis of interest rates and for many other applications as well).
Let us take an example. Imagine a stock analyst who is looking at the returns from a large number of stocks. There are many variables the analyst is looking at for the companies being analyzed, in fact dozens of them. The analyst has all of this data in a large spreadsheet, where each row contains a single observation (for different quarters, months, years or whatever way the data is organized). The columns read something like the list below. The data can be thought of as a T x n matrix, where T is the number of rows, with each row being an observation at a point in time, and n is the number of data fields.
Company name
PE ratio (Price Earnings)
NTM earnings (Next 12 months consensus earnings)
Revenue
Debt
EPS (Earnings per share)
EBITDA (Earnings before interest, tax, depreciation and amortization)
LQ Earnings (Last quarter's earnings)
Short Ratio
Beta
Cash flow
…etc up to column n.
If the analyst wants to analyze the relationship of the above variables to returns, it is not an easy task as the number of variables involved is not manageable. Of course, many of these variables are related to each other, and are highly correlated. For example, if revenues are high, EBITDA will be high. If EBITDA is high, Net Income will be high too, and if Net Income is high, so will be the EPS etc. Similar linkages and correlations can be seen with cash flows, LQ earnings etc. In other words, this represents a highly correlated system.
So what we can do with this data is to reduce the number of variables by condensing some of the correlated variables together into one single representation (or an artificial variable) called a ‘principal component’. How good is this principal component as a representation of the underlying data it represents? That question is answered by calculating the extent of variation in the original data that is captured by the principal component. (All these calculations, including how principal components are identified, are explained later in this article, but for the moment let us just go along to understand conceptually what PCA is.) The number of principal components that can be identified for any dataset is equal to the number of the variables in the dataset. But if one had to use all the principal components it would not be very helpful because the complexity of the data is not reduced at all, and in fact is amplified because we are replacing natural variables with artificial ones that may not have a logical interpretation. However, the advantage that principal component identification brings is that we can decide which principal components to use and which to discard. Each principal component accounts for a part of the total variation that the original dataset had. We pick the top 2 or 3 (or n) principal components so we have a satisfactory proportion of the variation in the original dataset.
At this point, let us look at what this ‘total variation’ means. Think of the data set as a scatterplot. If we had two variables, think about how they would look when plotted on a scatter plot. If we had three variables, try to visualize a three-dimensional space and how the data points would look – like a cloud kind of clustering together a little bit (or not) depending upon how correlated the system is. The ‘spread’ of this cloud is really the ‘variation’ contained in the data set. This can be measured in the form of variance, with each of the n columns having a variance. When all the principal components have been calculated, each of the principal components has a variance. We arrange the principal components in descending order of the variance each of them explains, take the top few principal components, add up their variance, and compare it to the total variance to determine how much of the variance is accounted for. If we have enough to meet our needs, we stop there, otherwise we can pick the next principal component in our analysis too. In finance, PCA is often performed for interest rates, and generally the top three components account for nearly 99% of the variance, allowing us to use just those three instead of the underlying 50-100 variables that arise from the various maturities (1 day to 30 years, or more).
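The "add up the variance of the top few components and compare it to the total" step can be sketched in a few lines of Python. This is only an illustration: the function name components_needed and the five variances are made up for this sketch and are not taken from the example later in this article.

```python
def components_needed(variances, target=0.95):
    """Return (k, cumulative_shares): the smallest number of components whose
    cumulative share of total variance reaches `target`, plus the running shares."""
    ordered = sorted(variances, reverse=True)  # largest variance first
    total = sum(ordered)
    cumulative_shares = []
    running = 0.0
    for value in ordered:
        running += value
        cumulative_shares.append(running / total)
    for k, share in enumerate(cumulative_shares, start=1):
        if share >= target:
            return k, cumulative_shares
    return len(ordered), cumulative_shares


# Hypothetical variances of five principal components (illustrative numbers only).
example_variances = [0.2, 3.0, 0.3, 1.0, 0.5]
k, shares = components_needed(example_variances, target=0.85)
print("components kept:", k)
print("cumulative share of variance:", [round(s, 2) for s in shares])
```

If the share reached by the top few components is not enough, raising the target simply pulls in the next component, which is the "pick the next principal component" step described above.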
PCA in practice
PCA begins with the covariance (or correlation) matrix. First, we calculate the covariance of all the original variables and create the covariance matrix.
For this covariance (or correlation) matrix, we now calculate the eigenvectors and eigenvalues. (This can be done using statistical packages, and also some Excel add-ins.)
Every eigenvector would be a column vector with as many elements as the number of variables in the original dataset. Thus if we had an initial dataset of the size T x n (recall: rows are the observations, columns represent variables, therefore we have T observations of n variables), the covariance matrix would be of the size n x n, and each of the eigenvectors will be n x 1.
The eigenvalues for each of the eigenvectors represent the amount of variance that the given eigenvector accounts for. We arrange the eigenvectors in decreasing order of the eigenvalues, and pick the top 2, 3 or as many eigenvectors as we are interested in, depending upon how much variance we want to capture in our model. If we include all the eigenvectors, then we would have captured all the variance but this would not give us any advantage over our initial data.
In a simplistic way, that is about all that there is to PCA. If you are reading the above for the first time, I would understand if it all sounds like gobbledygook. But no worries, we are going to go through an example to illustrate exactly how all of the above is done.
Example:
Let us consider the following data that explains the above. This is just a lot of data, typical stuff of the kind that you might encounter at work. Let us assume that this is data relating to stocks, with the symbols appearing in column A, and various variables relating to the symbol on the right. What we want to do is to perform PCA on this data and reduce the number of variables from 8 to something more manageable. We believe this can be done because if we look at the variables, some are obviously quite closely related. Higher revenues likely mean a higher EBITDA, and a higher EBITDA probably means a higher tev (total enterprise value). Similarly, a higher price (col C) probably means a higher market cap too – though of course we can not be sure unless we analyze the data. But it does appear that the data is related together in a way that we can represent all of this fairly successfully using just a few ‘artificial variables’, or principal components. Let us see how we can do that.
Based on the above, we can calculate the correlation matrix or the covariance matrix – either manually, or in one easy step using Excel’s data analysis feature.
The problem of scale – (ie, when to use the correlation matrix and when the covariance matrix?)
Let us get this question out of the way first. Consider two variables for which the units of measure differ significantly in terms of scale. For example, if one of the variables is in dollars and another in millions of dollars, then the scale of the variables will affect the covariance. You will find that the covariance will be significantly affected by the variable that has larger numerical quantities. The variable with the smaller numbers – even though this may be the more important number – will be overwhelmed by the other larger numbers in what it contributes to the covariance. One way to overcome this problem is to normalize (or standardize) the values by subtracting the mean and dividing by the standard deviation (ie, obtain the z-score, equal to (x – μ)/σ) and replacing all observations by such z-scores.
We can then use this normalized set of observations to perform further analysis because now everything will be expressed on a standard unitless scale in a way that the mean is zero and standard deviation is 1 for both sets of observations. Now, for this set of normalized observations, you will find that the correlation and covariance are identical. Why? Think for a minute about the conceptual difference between correlation and covariance. Covariance is in the units of both the variables. In a case where observations have been normalized, they no longer have any units. So you end up with a unitless covariance which is nothing but correlation. Formulaically, correlation is covariance divided by the standard deviations of the two variables. In the case where the observations have been normalized, the standard deviation of both variables is 1 and dividing covariance by 1 leaves us with the same number. In other words, correlation is really nothing but covariance normalized to a standard scale.
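The claim that the covariance of z-scores is the same number as the correlation of the original observations is easy to check numerically. The Python sketch below uses only the standard library; the two data series and all function names are invented for illustration (one series in dollars, one in millions of dollars, to mimic the scale problem described above).

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance (divides by T - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def std_dev(xs):
    return math.sqrt(covariance(xs, xs))


def correlation(xs, ys):
    """Covariance divided by the two standard deviations."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


def z_scores(xs):
    """Replace each observation with (x - mean) / standard deviation."""
    m, s = mean(xs), std_dev(xs)
    return [(x - m) / s for x in xs]


# Hypothetical observations: one variable in dollars, one in millions of dollars.
revenue_dollars = [1_200_000.0, 1_500_000.0, 900_000.0, 2_100_000.0, 1_800_000.0]
ebitda_millions = [0.20, 0.31, 0.12, 0.45, 0.33]

raw_cov = covariance(revenue_dollars, ebitda_millions)
corr = correlation(revenue_dollars, ebitda_millions)
z_cov = covariance(z_scores(revenue_dollars), z_scores(ebitda_millions))

print("covariance of raw data:", raw_cov)   # driven by the dollar-scale variable
print("correlation of raw data:", corr)
print("covariance of z-scores:", z_cov)     # matches the correlation
```

The raw covariance is a large number that mostly reflects the dollar-scale variable, while the last two printed values agree up to floating-point rounding.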
Because covariance includes the units of both the variables, it is affected by the scale. Basing PCA on a correlation matrix is identical to using standardized variables. Therefore in situations where scale is important and varies a great deal between the variables, correlation matrices would be preferable. In other cases where the units are the same and the scale does not vary widely, covariance matrices would do just fine. If we run PCA on the same data once using a correlation matrix and another time using a covariance matrix, the results will not be identical. Which one to use may ultimately be a question of judgment for the analyst.
In this case, we decide to proceed with the correlation matrix, though we could well have used the covariance matrix as all the variables are in dollars. However, for the mechanics of calculating the principal components that I am trying to demonstrate, this does not matter – so we will proceed with the correlation matrix.
The correlation matrix appears as follows. The top part is what the Excel output looks like (Data Analysis, Correlation). If you don’t have Data Analysis available in your version of Excel, that is probably because you haven’t yet installed the Data Analysis add-in; search Google on how to do that. The lower part is merely the same thing with the items above the diagonal filled in.
And there will be n eigenvectors. It is acceptable to put all the column vectors for the eigenvectors in one big n x n matrix too (which is how you will find it in some of the textbooks, and also below).
The eigenvectors multiplied by the underlying variables represent the principal components. Thus if an eigenvector is, say, [0.25, 0.23, 0.22, 0.30, 0.40], and the 5 variables in our analysis are v1, v2, v3, v4 and v5, then the principal component represented by that eigenvector is 0.25v1 + 0.23v2 + 0.22v3 + 0.30v4 + 0.40v5. The other principal components are similarly calculated using the other eigenvectors.
The eigenvalue for each eigenvector represents the amount of variation captured by that eigenvector. The total variation is given by the sum of all the eigenvalues. So if the eigenvalue for a principal component is 2.5 and the total of all eigenvalues is 5, then this particular principal component captures 50% of the variation.
We decide how many principal components to keep based upon the amount of variation they account for. For example, we may select all principal components above a certain threshold of contribution to accounting for variation, or all principal components whose eigenvectors have an eigenvalue greater than 1.
We can now represent the original variables as a function of the principal components – each original variable is equal to a linear combination of the principal components.
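The arithmetic in the steps above can be sketched in Python as follows. The eigenvector is the five-element example quoted in the text; the observation values and the list of eigenvalues are hypothetical numbers chosen so that the first eigenvalue is 2.5 out of a total of 5, and the function name is an assumption of this sketch.

```python
def principal_component_score(eigenvector, observation):
    """Weighted sum of the variables, with the eigenvector elements as weights."""
    if len(eigenvector) != len(observation):
        raise ValueError("eigenvector and observation must have the same length")
    return sum(w * v for w, v in zip(eigenvector, observation))


# Eigenvector from the text; the observation (v1..v5) is made up for illustration.
eigenvector = [0.25, 0.23, 0.22, 0.30, 0.40]
observation = [1.0, -0.5, 0.2, 0.0, 2.0]
score = principal_component_score(eigenvector, observation)
print("principal component value for this observation:", score)

# Hypothetical eigenvalues for five principal components (they total 5).
eigenvalues = [2.5, 1.2, 0.8, 0.3, 0.2]
share_of_first = eigenvalues[0] / sum(eigenvalues)
print("share of variation captured by the first component:", share_of_first)

# One selection rule mentioned above: keep components with an eigenvalue greater than 1.
kept = [value for value in eigenvalues if value > 1.0]
print("eigenvalues kept under the greater-than-1 rule:", kept)
```

The same score function would be applied to every observation, once per retained eigenvector, to build the reduced data set.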
What is ‘Total variance’ that we talk about?
The total variance in the data set is nothing but the mathematical aggregation of the variance of each of the variables. If the observations have been normalized, then the variance of each of the variables will be 1, and therefore the total variance will be 1 x n = n, where n is the number of variables. Once principal components have been computed, there will be a total of n principal components for n variables. The important thing to note is that the total of the variances of the principal components will be equal to the total variance of the observations. This allows us to pick the more relevant principal components by picking the ones with the most variance and ignoring the ones with the smaller variances, and still be able to cover most of the variation in the data set.
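The paragraph above can be written compactly as follows. The notation is an assumption of this note rather than the article's own: x_i are the n original variables, Σ is their covariance (or correlation) matrix, and λ_i are its eigenvalues sorted from largest to smallest.

```latex
\text{Total variance}
  \;=\; \sum_{i=1}^{n} \operatorname{Var}(x_i)
  \;=\; \operatorname{tr}(\Sigma)
  \;=\; \sum_{i=1}^{n} \lambda_i ,
\qquad
\text{share captured by the top } k \text{ components}
  \;=\; \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{n} \lambda_i} .
```

When the observations are normalized, every Var(x_i) equals 1, so the total is n, which is the 1 x n = n result stated above.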
Calculating eigenvalues and eigenvectors
There is no easy way I know of to calculate eigenvalues and eigenvectors in Excel. For our hypothetical example, the eigenvectors and the eigenvalues are given below. (These were calculated in a different statistical package: R, which is free to use and can be downloaded from www.r-project.org.)
Notice that the total of the eigenvalues is 8 – which is the same as the number of variables. Because we effectively normalized the variables by using the correlation matrix, the total of our eigenvalues is 8 which is the sum of the individual normalized variances (=1) of each of the eight variables. (If we had used the covariance matrix, the eigenvalues would have added to whatever the sum of the variances of each individual variable would have been.)
We also see that just the first 2 principal components account for nearly 75% of the variance. If we are comfortable with the simplicity that 2 variables offer instead of 8 at a cost of losing 25% of the variation in the data, we will use 2 principal components. If not, we can extend our model to include the third principal component which brings the total variance accounted for to nearly 88%.
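The eigenvalues in this article came from R. For readers curious about what such a routine does, here is a minimal sketch of one classical method, cyclic Jacobi rotations for symmetric matrices, written in plain Python. It is not the method R uses, and the 3 x 3 correlation matrix is a made-up example rather than the 8-variable matrix from this article.

```python
import math


def jacobi_eigen(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues and eigenvectors of a symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors), sorted by decreasing eigenvalue;
    eigenvectors[i] is the vector that belongs to eigenvalues[i].
    """
    n = len(matrix)
    a = [list(row) for row in matrix]  # working copy, driven towards diagonal form
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off_diagonal = math.sqrt(
            sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        )
        if off_diagonal < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle chosen so that the (p, q) element becomes zero.
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of a
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q of a
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):  # accumulate the rotations into v
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    pairs = [(a[i][i], [v[k][i] for k in range(n)]) for i in range(n)]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return [pair[0] for pair in pairs], [pair[1] for pair in pairs]


# Hypothetical 3 x 3 correlation matrix (illustrative values only).
correlation_matrix = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]
eigenvalues, eigenvectors = jacobi_eigen(correlation_matrix)
print("eigenvalues:", eigenvalues)
print("sum of eigenvalues:", sum(eigenvalues))  # equals the number of variables
```

In practice a statistical package is the sensible choice; the sketch is only meant to show that the eigenvalues of a correlation matrix add up to the number of variables, as noted above for the 8-variable example.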
Is Aw = λw?
Do our eigenvalues and eigenvectors satisfy Aw = λw, where A is a square matrix, w is an eigenvector of A and λ is a constant (the corresponding eigenvalue)? Let us test that. In this case, A is our correlation matrix. We find this relationship to be true for all eigenvectors.
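The same test can be scripted. The Python sketch below checks Aw = λw element by element; it uses a hypothetical 2 x 2 correlation matrix with correlation 0.6 (not the 8 x 8 matrix of this example), whose eigenvalues are 1.6 and 0.4 with eigenvectors along [1, 1] and [1, -1]. The function names are assumptions of this sketch.

```python
import math


def mat_vec(a, w):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * w_j for a_ij, w_j in zip(row, w)) for row in a]


def satisfies_eigen_equation(a, w, lam, tol=1e-9):
    """True if every element of Aw equals the matching element of lam * w."""
    return all(abs(lhs - lam * w_i) <= tol for lhs, w_i in zip(mat_vec(a, w), w))


r = 0.6  # hypothetical correlation between two variables
A = [[1.0, r], [r, 1.0]]

w1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # eigenvector for eigenvalue 1 + r
w2 = [1 / math.sqrt(2), -1 / math.sqrt(2)]  # eigenvector for eigenvalue 1 - r

print(satisfies_eigen_equation(A, w1, 1 + r))       # a genuine eigenpair
print(satisfies_eigen_equation(A, w2, 1 - r))       # a genuine eigenpair
print(satisfies_eigen_equation(A, [1.0, 0.0], 1.0)) # not an eigenvector of A
```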
Constructing principal components from eigenvectors
We derive the principal components from the eigenvectors as follows.
PC1 = 0.02070*quantity 0.14822*entry_price 0.04413*profit_dollar 0.47031*market_cap 0.37456*cash_and_marketable 0.47874*tev 0.46308*revenues 0.41295*ebitda
PC2 = 0.60642*quantity 0.63912*entry_price 0.33582*profit_dollar 0.00458*market_cap 0.30794*cash_and_marketable 0.01122*tev 0.04846*revenues 0.11699*ebitda
PC3 = 0.44257*quantity 0.16687*entry_price 0.81688*profit_dollar 0.00946*market_cap 0.23303*cash_and_marketable 0.03034*tev 0.08786*revenues 0.21437*ebitda
And so on.
How do we use the principal components for further analysis? For any observation, we set up a routine to calculate the principal component equivalent and use that instead of the original variables – whether for regression, or any other kind of analysis. So PCA is merely a means to an end – it does not give you an ‘answer’ that you can use right away.
Interpreting principal components
We look at the coefficients assigned to each of the principal components, and try to see if there is a common thread between the factors (in this case, quantity, entry_price, profit_dollar etc are the ‘factors’). The common thread could be an underlying common economic cause, or other explanation that makes them similar. In our hypothetical example, we find that the first principal component is heavily “loaded” on the last 5 factors. The second principal component is similarly “loaded” on the first three, and also a bit on the fifth one. (You can see this by examining the eigenvectors for the respective principal component.)
In the case of interest rates, by looking at how the principal components are constructed, we find that the first 3 principal components have an intuitive interpretation: they are called the trend, the tilt and the curvature components.
That is all for PCA folks! Hope the above made sense and is helpful. Feel free to reach out to me by email if you think something above is not correct or can be improved, or if you just have a comment or a question.
This article attempts to provide an intuitive understanding of what PCA is, and what it can do. PRMIA has been asking questions on PCA, but the way the subject is presented in the Handbook is not appropriate for someone who has not studied it before in the classroom. This article aims to provide an intuitive understanding of what PCA is so you can approach the material in the handbook with confidence.
But before we even start on Principal Component Analysis, make sure you have read the tutorial on Eigenvectors et al here. Otherwise much of this is not going to make any sense.
Hits: 49273
Regression Analysis
Linear regression is an important concept in finance and practically all forms of research. It is also used extensively in the application of data mining techniques. This article provides an overview of linear regression, and more importantly, how to interpret the results provided by linear regression. We will discuss understanding regression in an intuitive sense, and also how to practically interpret the output of a regression analysis. In particular, we will look at the different variables such as p-value, t-stat and other output provided by regression analysis in Excel. We will also look at how regression is connected to beta and correlation.
Hits: 91062
Modeling interest rate changes
Modeling the behavior of short term interest rates
There are a number of ways that are used for modeling short term interest rates. These have not been covered in the PRMIA handbook, but they find a reference in one of their study guides. So just to be cautious, a bit of explanation for these is provided here in case there are questions in the exam relating to these concepts.
What should you know? Well, in all likelihood you will not be asked a question on this topic, but if you have 20 minutes, have a read and you will be slightly better prepared. Recognize the formulae for the 5 models mentioned here, and try to remember some of the differences between the no-arbitrage and equilibrium models. Or you could entirely skip this as well. I will be putting a few questions in on this subject, but will clearly mark them as unlikely.
Hits: 9329
Volatility, returns and the behavior of stock prices
This article follows the earlier tutorial on stochastic processes. It talks about a couple of related things, and tries to provide an intuitive understanding of some of the following concepts:
Hits: 17449
CreditRisk+, or the actuarial approach to measuring credit risk
This is the final of five articles, each explaining at a high level one of the five credit risk models in the PRMIA handbook. This write-up deals with the actuarial, or the 'CreditRisk+' model.
Hits: 10615
The KMV approach to measuring credit risk
This is the fourth of five articles covering each of the main portfolio approaches to credit risk as explained in the handbook. The idea is to provide a high level and concise explanation of each of the approaches so it may be easier to deal with the detail provided in the handbook. I hope you find it useful.
Hits: 34688
The structural approach to credit risk
This is the third of five articles covering credit risk - this one addresses the 'structural approach'.
Hits: 11202
Credit Portfolio View
This is the second of five articles that discuss the various approaches to measuring credit risk in a portfolio. This article covers CreditPortfolio view. CreditPortfolio View is conceptually not too dissimilar from the Credit Metrics model described earlier, ie it relies upon a knowledge of the transition matrices between the different credit ratings. The only difference is that the transition matrix itself has an adjustment applied to it for the business cycle. But once this adjusted transition matrix has been obtained, the rest of the process works in the same way as for the Credit Metrics model.
Hits: 13061
Quick primer on Black Scholes
The conceptual idea behind Black Scholes is rather simple – but as the argument advances beyond the initial idea, things become more complex with differential equations, risk-neutrality and log returns stepping in. For the PRMIA exam, you will not be asked for a derivation of Black Scholes, so it may suffice to know just a couple of things. This brief write-up aims to summarize just those few things.
Hits: 10532
Eigenvectors, eigenvalues and orthogonality
This is a quick write up on eigenvectors, eigenvalues, orthogonality and the like. These topics have not been very well covered in the handbook, but are important from an examination point of view.
Hits: 45238
Understanding convexity: first and second derivatives of a price function
First and second derivatives are important in finance – in particular in measuring risk for fixed income and options. In fixed income – the first and second derivatives are modified duration and convexity respectively, and for options, these are delta and gamma. But what do these really mean – and what does one think about them when one sees a number? The rest of this article attempts to provide an intuitive look at how price changes for a bond (or an option) are determined by the first and the second derivative, what they mean, and how they are to be interpreted.
Hits: 44452
Credit Migration Framework
This is the first of five articles that provide a high level understanding of the various portfolio models of credit risk covered in the PRMIA syllabus. Being the first one, this discusses the credit migration framework. (I am still working on the others.) This article is intended to provide a conceptual understanding of the approach and I have not provided numerical examples for the reason that I don’t want to duplicate what is already there in the Handbook. Once you have read this, the scattered explanation in the Handbook will hopefully make more sense.
Hits: 12502
Combining expected values and variances
When constructing portfolios we are often concerned with the return (ie the mean, or expected value), and the risk (ie the volatility, or standard deviation) of combining positions or portfolios. We may also be faced with situations where we need to know the risk and return if position sizes were to be scaled up or down in a linear way. This brief article deals with how the means and variances of two different variables can be combined, and how they react to being added to or multiplied by constants.
Hits: 25270
Default Correlations
This is a brief article on default correlations – what they mean, and how to interpret them. To keep things simple, let us consider only two securities, A and B. Let us look at how default correlations are calculated, and then try to think about how to intuitively interpret a given default correlation number between two securities.
Hits: 19404
Credit VaR - an intuitive understanding
This brief article intends to clarify the differences between some concepts relating to credit VaR. One thing to note about credit risk is that you need to watch out whether you are inferring VaR from a distribution of the value of the portfolio, or from a distribution of the losses in the portfolio. One is a mirror image of the other, they give the same results, but they are not identical and you should intuitively understand the difference.
Hits: 40032
Capital tiers under Basel II
The constituents of capital under Basel II
Hits: 32361
More about continuous compounding
It is important to understand continuously compounded rates. These rates are rarely encountered in day-to-day life, but are relevant to a finance professional. You will never see, for example, a bank advertise ‘continuously compounded rates’ for its deposits. (In fact, it may even be against the law to do so as they may be required to disclose easier-to-understand APRs.)
To understand continuously compounded rates, think about natural processes. Think about how, for example, population grows. It does not grow in discrete steps. It just grows all the time.
Hits: 10996
Descriptive stats for the PRMIA exams
This is a very brief article, perhaps unjustly so given what it covers. I have tried to keep it very short, so as to be a practical reference to key statistical terms that are used throughout risk management. This covers standard deviation, variance, covariance, correlation, regression and the famous 'square root of time' rule. The PRMIA handbook has more stuff but this covers the key things you must know - almost by heart!
Hits: 5537
VaR and heavy tails
Value at risk is affected by tails and there is so much stuff in the PRMIA handbook about dealing with heavy tails. This can be confusing as the handbook sort of presumes an understanding of how tails affect VaR, so here is a short tutorial to explain how heavy tails affect Value at Risk.
Hits: 6573
Distributions in finance
A lot of finance and risk management is about distributions. For the PRMIA exam, you really need to understand the concepts underlying distributions, what different shapes mean, what the parameters are, what a cdf is (vs a pdf) and which to use when. Of course, the most commonly used distribution assumption is that returns are normally distributed, so this article talks about the normal distribution and also other important distributions. More importantly, I have provided spreadsheets that model each of the distributions so you can play around and see the behaviour of the distribution as you change the parameters underlying it.
Hits: 19838
A refresher on logarithms
A quick refresher on logarithms. Just the basics, and not a whole lot more. Explains what logarithm means, how it is related to e, how to add/subtract logs and convert bases.
Hits: 5288
Sequences: Arithmetic and Geometric Progressions
Quick refresher on arithmetic and geometric progressions - straightforward, and lists just the formulae. Not a whole lot.
Hits: 3695
Interest rates and continuous compounding
If you are new to finance, or haven't actually done much math in a while, the differences between discrete, compounded and continuously compounded interest rates can be quite confusing. You may go through many chapters in the handbook while still having a nagging doubt as to whether you really get the interest rate part: sometimes they use (1+r)^n, at other times it is exp(rn), so what's going on? This brief article explains what continuously compounded interest rates are, how they work and how they are to be used.
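As a quick numerical illustration of how (1+r)^n and exp(rn) relate, the Python sketch below compounds the same rate more and more frequently and compares the result with the continuous formula. The 5% rate, the 10-year horizon and the function names are assumptions made for this sketch.

```python
import math


def discrete_growth(rate, years, periods_per_year=1):
    """Growth factor when interest is compounded a fixed number of times a year."""
    return (1.0 + rate / periods_per_year) ** (periods_per_year * years)


def continuous_growth(rate, years):
    """Growth factor under continuous compounding."""
    return math.exp(rate * years)


rate, years = 0.05, 10  # hypothetical: 5% a year for 10 years

for periods in (1, 2, 12, 365):
    print(periods, "compounding periods a year:", discrete_growth(rate, years, periods))
print("continuous compounding:", continuous_growth(rate, years))
```

With one period a year the first function is just (1+r)^n; as the number of periods grows, its output approaches exp(rn) from below.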
Hits: 50395
Calculating forward exchange rates - covered interest parity
An easy hit in the PRMIA exam is getting the question based on covered interest parity right. It will come with a couple of exchange rates, interest rates and dates, and there would be one thing missing that you will be required to calculate. This brief write up attempts to provide an intuitive understanding of how and why covered interest parity works. There are a number of questions relating to this that I have included in the question pool, and this article addresses the key concepts with some examples.
Hits: 157379
Modeling portfolio variance in Excel
This article is about an Excel model for calculating portfolio variance. When it comes to calculating portfolio variance with just two assets, life is simple. But consider a situation when there are 10, 15, maybe hundreds of assets. This brief article is a practical demonstration of how portfolio variance can be modeled in Excel - the underlying math, and an actual spreadsheet for your playing pleasure! Enjoy!
Hits: 130792
Option strategies
A brief discussion of option strategies relevant to the PRM exam
Hits: 4062
Understanding option Greeks
Option Greeks are tricky beasts with their complex formulae and intimidating names. This article is not about the formulae at all. The idea is to discuss what each of the Greeks represent, and understand what drives each of them. Often the PRMIA exam will ask a question about a Greek, and very likely that question will expect you to understand the relationship between the variable and the underlying asset, and how the Greeks can be used to measure and manage risks.
Hits: 13675
Valuing an option
This rather brief article covers some familiar theory (how options are valued) but without repeating all the text you can find in a textbook. Of course, we all know about the Black Scholes model, and anyone can punch it into Excel or use any number of online calculators available for free. What this article tries to do is to provide an intuitive understanding of what drives option values, and also provides an Excel model incorporating Black Scholes that you could use to play around with.
Hits: 4132
Introduction to vanilla options
In this article we will discuss basic vanilla options, calls and puts, and understand payoff diagrams. We will also look at put-call parity. Put-call parity is important to understand for the PRMIA exam as a number of questions, such as those relating to the relationship between call and put values and the additive nature of option Greeks, are based upon put-call parity.
Hits: 7671
Risk adjusted performance measures
Returns are the reward for taking risk: where there is no risk, there are no profits either. This article discusses the Sharpe ratio, Treynor ratio, Information Ratio, Jensen’s alpha and the Kappa indices, which are all measures to evaluate risk adjusted performance.
Hits: 21001
Stochastic processes
Hits: 8853
