Manipulating and Pre-Processing Data with R: Statistics with R Series
Contents: 1. Download and installation of R. 2. Data-reading and writing. 3. Data-overview in R. 4. Data manipulations. 5. Descriptive statistics. 6. Data distribution. 7. Detection of outliers. 8. Data-transformation. 9. Homoscedasticity and heteroscedasticity. 10. Text data pre-processing.
We collect data from various sources that differ in structure. Before these data are integrated for modelling or other purposes, the first logical step is to manipulate and pre-process them: identifying missing data, detecting outliers, determining the probability distribution, and transforming and standardizing the data. These pre-processing techniques aim to improve the quality and accuracy of data interpretation and modelling. In many cases, however, these steps are skipped, either through lack of awareness or because expensive commercial software is unavailable.