R Programming

Hadoop & R Programming
Strings, Vectors, List, Arrays

Strings
Any value written within a pair of single quotes or double quotes in R is treated as a string.
Internally R stores every string within double quotes, even when you create it with single
quotes.
Rules Applied in String Construction
•The quotes at the beginning and end of a string should be both double quotes or both single
quotes. They cannot be mixed.
•Double quotes can be inserted into a string starting and ending with a single quote.
•Single quotes can be inserted into a string starting and ending with double quotes.
•Double quotes cannot be inserted into a string starting and ending with double quotes, unless
each one is escaped with a backslash (\").
•Single quotes cannot be inserted into a string starting and ending with a single quote, unless
each one is escaped with a backslash (\').
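A quote of the same kind as the delimiters can still appear inside a string when it is escaped with a backslash. The following is a minimal sketch of that; the variable names s1 and s2 are only illustrative.

```r
# Escaping lets a string contain the same kind of quote that delimits it.
s1 <- "Double quote \" inside double quotes"
s2 <- 'Single quote \' inside single quotes'

# cat() writes the characters themselves; print() would show the escapes.
cat(s1, "\n")
cat(s2, "\n")

# The backslash is not stored: an escaped quote is a single character.
print(nchar("\""))
```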
Examples of Valid Strings
The following examples clarify the rules about creating a string in R.

a <- 'Start and end with single quote'
print(a)
b <- "Start and end with double quotes"
print(b)
c <- "single quote ' in between double quotes"
print(c)
d <- 'Double quotes " in between single quote'
print(d)
Examples of Invalid Strings
The following examples break the rules above, so R reports an error when it parses them.

e <- 'Mixed quotes"
print(e)
f <- 'Single quote ' inside single quote'
print(f)
g <- "Double quotes " inside double quotes"
print(g)
Output
Error: unexpected symbol in:
"print(e)
f <- 'Single"
Execution halted
String Manipulation
Concatenating Strings - paste() function
Many strings in R are combined using the paste() function. It can take any number of
arguments to be combined together.
Syntax
The basic syntax for paste function is −
paste(..., sep = " ", collapse = NULL)
Following is the description of the parameters used −
1. ... represents any number of arguments to be combined.
2. sep is the separator placed between the arguments. It is optional and defaults to a single space.
3. collapse is an optional string used to join the elements of the resulting vector into one
string. It does not affect the spaces within an individual string.
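The difference between sep and collapse is easiest to see when one of the arguments is a vector. The following is a small sketch; the names parts, joined_sep, joined_collapse and joined_both are made up for the illustration.

```r
parts <- c("a", "b", "c")

# sep goes between the arguments, element by element: the result has 3 strings.
joined_sep <- paste("x", 1:3, sep = "_")
print(joined_sep)

# collapse joins the elements of the result into a single string.
joined_collapse <- paste(parts, collapse = "-")
print(joined_collapse)

# Both together: first paste element-wise with sep, then join with collapse.
joined_both <- paste("x", 1:3, sep = "_", collapse = "+")
print(joined_both)
```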
Program:
a <- "Hello"
b <- 'How'
c <- "are you? "
print(paste(a,b,c))
print(paste(a,b,c, sep = "-"))
print(paste(a,b,c, sep = "", collapse = ""))
Program output:
[1] "Hello How are you? "
[1] "Hello-How-are you? "
[1] "HelloHoware you? "
Formatting numbers & strings - format() function
Numbers and strings can be formatted to a specific style using format() function.
Syntax
The basic syntax for format function is −
format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none"))
x is the vector input.
digits is the number of significant digits displayed.
nsmall is the minimum number of digits to the right of the decimal point.
scientific is set to TRUE to display scientific notation.
width is the minimum width of the result. Numbers are padded with blanks at the beginning; strings
are padded according to justify.
justify is the alignment of strings: left, right or centre.
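As a quick illustration of these parameters, the sketch below formats a numeric vector to a common width, forces two decimal places, and right-justifies a string. The variable names are only illustrative.

```r
# All elements of a numeric vector are formatted to a common width.
padded <- format(c(1, 10, 100))
print(padded)

# nsmall guarantees at least two digits after the decimal point.
two_dec <- format(2, nsmall = 2)
print(two_dec)

# Strings are left-justified by default; justify changes the alignment.
right <- format("Hi", width = 5, justify = "right")
print(right)
```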
# Total number of digits displayed. Last digit rounded off.
result <- format(23.123456789, digits = 9)
print(result)
# Display numbers in scientific notation.
result <- format(c(6, 13.14521), scientific = TRUE)
print(result)
# The minimum number of digits to the right of the decimal point.
result <- format(23.47, nsmall = 5)
print(result)
# Format treats everything as a string.
result <- format(6)
print(result)
# Numbers are padded with blank in the beginning for width.
result <- format(13.7, width = 6)
print(result)
# Left justify strings.
result <- format("Hello", width = 8, justify = "l")
print(result)
# Justify string with center.
result <- format("Hello", width = 8, justify = "c")
print(result)
When we execute the above code, it produces the following result −
[1] "23.1234568"
[1] "6.000000e+00" "1.314521e+01"
[1] "23.47000"
[1] "6"
[1] "  13.7"
[1] "Hello   "
[1] " Hello  "

Changing the case - toupper() & tolower() functions
These functions change the case of characters of a string.
Syntax
The basic syntax for toupper() & tolower() function is −
toupper(x)
tolower(x)
x is the vector input.

Example
# Changing to Upper case.
result <- toupper("Changing To Upper")
print(result)
# Changing to lower case.
result <- tolower("Changing To Lower")
print(result)
[1] "CHANGING TO UPPER"
[1] "changing to lower"

Extracting parts of a string - substring() function
This function extracts parts of a String.
Syntax
The basic syntax for substring() function is −
substring(x,first,last)
x is the character vector input.
first is the position of the first character to be extracted.
last is the position of the last character to be extracted.
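substring() is vectorised over first and last, and last can be omitted to read to the end of the string. The following is a minimal sketch; the names word, pieces and tail_part are only illustrative.

```r
word <- "Extract"

# Vectors of start and end positions give one substring per pair.
pieces <- substring(word, 1:3, 3:5)
print(pieces)

# Without last, the substring runs to the end of the string.
tail_part <- substring(word, 5)
print(tail_part)
```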

Example:
# Extract characters from 5th to 7th position.
result <- substring("Extract", 5, 7)
print(result)
[1] "act"
Vectors?
A. Vectors are the most basic R data objects and there are six types of atomic vectors. They are
logical, integer, double, complex, character and raw.
Vector Creation
Single Element Vector
Even when you write just one value in R, it becomes a vector of length 1 and belongs to one of
the above vector types.
# Atomic vector of type character.
print("abc");
# Atomic vector of type double.
print(12.5)
# Atomic vector of type integer.
print(63L)
# Atomic vector of type logical.
print(TRUE)
# Atomic vector of type complex.
print(2+3i)
# Atomic vector of type raw.
print(charToRaw('hello'))
When we execute the above code, it produces the following result −
[1] "abc"
[1] 12.5
[1] 63
[1] TRUE
[1] 2+3i
[1] 68 65 6c 6c 6f
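The type of each of these single-element vectors can be checked with typeof(). The sketch below simply collects the six values from the example in a list (the names values and types are illustrative) and asks for the type of each.

```r
# One value of each atomic type, taken from the example above.
values <- list("abc", 12.5, 63L, TRUE, 2+3i, charToRaw("hello"))

# typeof() reports the atomic type of each vector.
types <- sapply(values, typeof)
print(types)

# A single value is still a vector of length 1.
print(length(12.5))
```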
Multiple Elements Vector
Using colon operator with numeric data
# Creating a sequence from 5 to 13.
v <- 5:13
print(v)
# Creating a sequence from 6.6 to 12.6.
v <- 6.6:12.6
print(v)
# If the final element specified does not belong to the sequence then it is discarded.
v <- 3.8:11.4
print(v)
[1]  5  6  7  8  9 10 11 12 13
[1]  6.6  7.6  8.6  9.6 10.6 11.6 12.6
[1]  3.8  4.8  5.8  6.8  7.8  8.8  9.8 10.8
Using the seq() function
# Create a vector with elements from 5 to 9 incrementing by 0.4.
print(seq(5, 9, by = 0.4))
[1] 5.0 5.4 5.8 6.2 6.6 7.0 7.4 7.8 8.2 8.6 9.0
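Besides a step size, seq() can also be given the desired number of elements, or a negative step. The following is a short sketch with illustrative variable names.

```r
# Five evenly spaced values from 0 to 1.
even <- seq(0, 1, length.out = 5)
print(even)

# A descending sequence needs a negative step.
down <- seq(10, 1, by = -3)
print(down)

# seq_len(n) is a safe way to write 1, 2, ..., n.
idx <- seq_len(4)
print(idx)
```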
Using the c() function
The non-character values are coerced to character type if one of the elements is a character.
# The logical and numeric values are converted to characters.
s <- c('apple','red',5,TRUE)
print(s)
[1] "apple" "red" "5" "TRUE"
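The same coercion happens among the other types: c() converts all elements to the most general type present, in the order logical, integer, double, character. A minimal sketch, with illustrative variable names:

```r
# logical + integer -> integer (TRUE becomes 1)
li <- c(TRUE, 2L)
print(typeof(li))

# integer + double -> double
id <- c(2L, 3.5)
print(typeof(id))

# double + character -> character
dc <- c(3.5, "a")
print(typeof(dc))
```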

Accessing Vector Elements
•Elements of a Vector are accessed using indexing.
•The [ ] brackets are used for indexing. Indexing starts with position 1.
•Giving a negative value in the index drops that element from result.
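•Logical values (TRUE, FALSE) can also be used for indexing. A numeric index of 0 selects nothing,
so in a 0/1 index only the 1 entries have an effect, and each selects the first element.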

# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[c(2,3,6)]
print(u)
# Accessing vector elements using logical indexing.
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
# Accessing vector elements using negative indexing.
x <- t[c(-2,-5)]
print(x)
# Accessing vector elements using 0/1 indexing: each 0 is ignored and 1 selects the first element.
y <- t[c(0,0,0,0,0,0,1)]
print(y)
[1] "Mon" "Tue" "Fri"
[1] "Sun" "Fri"
[1] "Sun" "Tue" "Wed" "Fri" "Sat"
[1] "Sun"
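In practice the logical index is usually computed from a condition rather than typed out, and elements can also be picked by name. The sketch below uses a made-up named vector called scores to show both.

```r
# A named numeric vector (illustrative data).
scores <- c(math = 72, art = 91, music = 65)

# A comparison yields a logical vector, which is then used as the index.
high <- scores[scores > 70]
print(high)

# which() returns the positions where the condition holds.
positions <- which(scores > 70)
print(positions)

# Named elements can be accessed by name.
print(scores["art"])
```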
Vector Manipulation
Vector arithmetic
Two vectors of same length can be added, subtracted, multiplied or divided giving the result as a
vector output.
# Create two vectors.
v1 <- c(3,8,4,5,0,11)
v2 <- c(4,11,0,8,1,2)
# Vector addition.
add.result <- v1+v2
print(add.result)
# Vector subtraction.
sub.result <- v1-v2
print(sub.result)
# Vector multiplication.
multi.result <- v1*v2
print(multi.result)
# Vector division.
divi.result <- v1/v2
print(divi.result)

[1]  7 19  4 13  1 13
[1] -1 -3  4 -3 -1  9
[1] 12 88  0 40  0 22
[1] 0.7500000 0.7272727       Inf 0.6250000 0.0000000 5.5000000
Vector Element Recycling
If we apply arithmetic operations to two vectors of unequal length, then the elements of the
shorter vector are recycled to complete the operations.
v1 <- c(3,8,4,5,0,11)
v2 <- c(4,11)
# v2 is recycled to c(4,11,4,11,4,11)
add.result <- v1+v2
print(add.result)
sub.result <- v1-v2
print(sub.result)
[1]  7 19  8 16  4 22
[1] -1 -3  0 -6 -4  0
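Recycling works silently only when the longer length is a multiple of the shorter one. Otherwise R still computes a result but issues a warning. The sketch below captures that warning with withCallingHandlers(); the variable names a, b, warned and total are illustrative.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(10, 20)          # length 2 does not divide length 5

warned <- FALSE
total <- withCallingHandlers(
  a + b,
  warning = function(w) {
    warned <<- TRUE                  # remember that R complained
    invokeRestart("muffleWarning")   # carry on with the computed result
  }
)

# b was recycled as 10, 20, 10, 20, 10.
print(total)
print(warned)
```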
Vector Element Sorting
Elements in a vector can be sorted using the sort() function.
v <- c(3,8,4,5,0,11, -9, 304)
# Sort the elements of the vector.
sort.result <- sort(v)
print(sort.result)
# Sort the elements in the reverse order.
revsort.result <- sort(v, decreasing = TRUE)
print(revsort.result)
# Sorting character vectors.
v <- c("Red","Blue","yellow","violet")
sort.result <- sort(v)
print(sort.result)
# Sorting character vectors in reverse order.
revsort.result <- sort(v, decreasing = TRUE)
print(revsort.result)

[1]  -9   0   3   4   5   8  11 304
[1] 304  11   8   5   4   3   0  -9
[1] "Blue"   "Red"    "violet" "yellow"
[1] "yellow" "violet" "Red"    "Blue"  
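sort() returns the sorted values themselves. When one vector should be rearranged according to the order of another, order() returns the positions instead. A small sketch with illustrative data:

```r
emp_names  <- c("Rick", "Dan", "Michelle")
emp_salary <- c(623.3, 515.2, 611.0)

# order() gives the positions that would sort emp_salary ascending.
idx <- order(emp_salary)
print(idx)

# The same positions rearrange the names to match.
by_salary <- emp_names[idx]
print(by_salary)
```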
Explain about the list?

Lists are R objects which can contain elements of different types, such as numbers, strings, vectors
and other lists. A list can also contain a matrix or a function as its elements. A list is
created using the list() function.
Creating a List
Following is an example to create a list containing strings, numbers, a vector and a logical value.
# Create a list containing strings, numbers, a vector and a logical
# value.
list_data <- list("Red", "Green", c(21,32,11), TRUE, 51.23, 119.1)
print(list_data)
[[1]]
[1] "Red"
[[2]]
[1] "Green"
[[3]]
[1] 21 32 11
[[4]]
[1] TRUE
[[5]]
[1] 51.23
[[6]]
[1] 119.1
Naming List Elements
The list elements can be given names and they can be accessed using these names.
# Create a list containing a vector, a matrix and a list.
list_data <- list(c("Jan","Feb","Mar"), matrix(c(3,9,5,1,-2,8), nrow = 2), list("green",12.3))
# Give names to the elements in the list.
names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")
# Show the list.
print(list_data)
$`1st Quarter`
[1] "Jan" "Feb" "Mar"
$A_Matrix
     [,1] [,2] [,3]
[1,]    3    5   -2
[2,]    9    1    8
$`A Inner list`
$`A Inner list`[[1]]
[1] "green"
$`A Inner list`[[2]]
[1] 12.3
Accessing List Elements

Elements of the list can be accessed by the index of the element in the list. In case of named
lists it can also be accessed using the names.
We continue to use the list in the above example −
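list_data <- list(c("Jan","Feb","Mar"), matrix(c(3,9,5,1,-2,8), nrow = 2),
   list("green",12.3))
# Give names to the elements in the list.
names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")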
# Access the first element of the list.
print(list_data[1])
# Access the third element. As it is also a list, all its elements will be printed.
print(list_data[3])
# Access the list element using the name of the element.
print(list_data$A_Matrix)
$`1st Quarter`
[1] "Jan" "Feb" "Mar"
$`A Inner list`
$`A Inner list`[[1]]
[1] "green"
$`A Inner list`[[2]]
[1] 12.3
     [,1] [,2] [,3]
[1,]    3    5   -2
[2,]    9    1    8
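Single brackets return a sub-list, while double brackets (or $) return the element itself. The sketch below rebuilds the named list from the example and shows the difference; the names sub_list, months, inner_value and cell are illustrative.

```r
list_data <- list(c("Jan","Feb","Mar"), matrix(c(3,9,5,1,-2,8), nrow = 2),
   list("green",12.3))
names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")

# [ ] keeps the list wrapper: the result is a list of length 1.
sub_list <- list_data[1]
print(class(sub_list))

# [[ ]] extracts the element itself: here a character vector.
months <- list_data[[1]]
print(class(months))

# Extraction can be chained to reach nested elements.
inner_value <- list_data[["A Inner list"]][[2]]
cell <- list_data$A_Matrix[2, 3]
print(inner_value)
print(cell)
```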
Manipulating List Elements

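We can add, delete and update list elements as shown below. The example adds and deletes an
element at the end of the list, but any element can be deleted or updated by its index or name,
and append() can insert elements at other positions.
list_data <- list(c("Jan","Feb","Mar"), matrix(c(3,9,5,1,-2,8), nrow = 2),
   list("green",12.3))
# Give names to the elements in the list.
names(list_data) <- c("1st Quarter", "A_Matrix", "A Inner list")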
# Add an element at the end of the list.
list_data[4] <- "New element"
print(list_data[4])
# Remove the last element.
list_data[4] <- NULL
# Print the 4th Element.
print(list_data[4])

# Update the 3rd Element.
list_data[3] <- "updated element"
print(list_data[3])
[[1]]
[1] "New element"
$<NA>
NULL
$`A Inner list`
[1] "updated element"
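Elements do not have to be added or removed at the end. As a minimal sketch, append() with the after argument inserts at a chosen position, and assigning NULL removes any element; the list colours is made up for the illustration.

```r
colours <- list("red", "green", "blue")

# Insert a new element after position 1.
colours <- append(colours, list("yellow"), after = 1)

# Remove the element now at position 3 ("green").
colours[[3]] <- NULL

print(unlist(colours))
```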

Merging Lists
You can merge many lists into one list by passing all the lists to the c() function.
# Create two lists.
list1 <- list(1,2,3)
list2 <- list("Sun","Mon","Tue")
# Merge the two lists.
merged.list <- c(list1,list2)
# Print the merged list.
print(merged.list)

[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] "Sun"
[[5]]
[1] "Mon"
[[6]]
[1] "Tue"
Converting List to Vector
A list can be converted to a vector so that the elements of the vector can be used for further
manipulation. All the arithmetic operations on vectors can be applied after the list is converted
into vectors. To do this conversion, we use the unlist() function. It takes the list as input and
produces a vector.
# Create lists.
list1 <- list(1:5)
print(list1)
list2 <-list(10:14)
print(list2)
# Convert the lists to vectors.
v1 <- unlist(list1)
v2 <- unlist(list2)
print(v1)
print(v2)
# Now add the vectors
result <- v1+v2
print(result)
[[1]]
[1] 1 2 3 4 5
[[1]]
[1] 10 11 12 13 14
[1] 1 2 3 4 5
[1] 10 11 12 13 14
[1] 11 13 15 17 19
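Two details of unlist() are worth keeping in mind: elements of mixed types are coerced to a common type, and names are carried over (with a numeric suffix when an element has several values). A short sketch with illustrative names:

```r
# Mixed types are coerced to the most general type, here character.
flat <- unlist(list(1, "a", TRUE))
print(flat)

# Names are kept; multi-value elements get numbered names.
named <- unlist(list(a = 1, b = 2:3))
print(named)
```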
Matrices
Matrices are the R objects in which the elements are arranged in a two-dimensional rectangular
layout. They contain elements of the same atomic types. Though we can create a matrix
containing only characters or only logical values, they are not of much use. We use matrices
containing numeric elements to be used in mathematical calculations.
A Matrix is created using the matrix() function.
Syntax
The basic syntax for creating a matrix in R is −
matrix(data, nrow, ncol, byrow, dimnames)
data is the input vector which becomes the data elements of the matrix.
nrow is the number of rows to be created.
ncol is the number of columns to be created.
byrow is a logical value. If TRUE then the input vector elements are arranged by row.
dimnames is the names assigned to the rows and columns.
Create a matrix taking a vector of numbers as input.
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(P)
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14
     [,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12
[3,]    5    9   13
[4,]    6   10   14
     col1 col2 col3
row1    3    4    5
row2    6    7    8
row3    9   10   11
row4   12   13   14
Accessing Elements of a Matrix
Elements of a matrix can be accessed by using the column and row index of the element. We
consider the matrix P above to find the specific elements below.
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
# Create the matrix.
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
# Access the element at 3rd column and 1st row.
print(P[1,3])
# Access the element at 2nd column and 4th row.
print(P[4,2])
# Access only the 2nd row.
print(P[2,])
# Access only the 3rd column.
print(P[,3])
[1] 5
[1] 13
col1 col2 col3
   6    7    8
row1 row2 row3 row4
   5    8   11   14
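Because P has row and column names, elements can also be accessed by name. Selecting a single row or column normally drops the result to a plain vector; drop = FALSE keeps it as a matrix. A minimal sketch (the names by_name and col_matrix are illustrative):

```r
P <- matrix(3:14, nrow = 4, byrow = TRUE,
   dimnames = list(c("row1", "row2", "row3", "row4"), c("col1", "col2", "col3")))

# Access an element by row and column name.
by_name <- P["row2", "col3"]
print(by_name)

# Keep the 3rd column as a 4 x 1 matrix instead of a vector.
col_matrix <- P[, 3, drop = FALSE]
print(dim(col_matrix))
```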
Matrix Computations
Various mathematical operations are performed on the matrices using the R operators. The result
of the operation is also a matrix.
The dimensions (number of rows and columns) should be the same for the matrices involved in
the operation.
Matrix Addition & Subtraction
# Create two 2x3 matrices.
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
print(matrix1)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)
print(matrix2)
# Add the matrices.
result <- matrix1 + matrix2
cat("Result of addition","\n")
print(result)
# Subtract the matrices
result <- matrix1 - matrix2
cat("Result of subtraction","\n")
print(result)
     [,1] [,2] [,3]
[1,]    3   -1    2
[2,]    9    4    6
     [,1] [,2] [,3]
[1,]    5    0    3
[2,]    2    9    4
Result of addition
     [,1] [,2] [,3]
[1,]    8   -1    5
[2,]   11   13   10
Result of subtraction
     [,1] [,2] [,3]
[1,]   -2   -1   -1
[2,]    7   -5    2
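Matrix Multiplication & Division
The * and / operators work element by element. They do not compute the algebraic matrix
product, for which R provides the %*% operator.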
# Create two 2x3 matrices.
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
print(matrix1)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)
print(matrix2)
# Multiply the matrices.
result <- matrix1 * matrix2
cat("Result of multiplication","\n")
print(result)
# Divide the matrices
result <- matrix1 / matrix2
cat("Result of division","\n")
print(result)
     [,1] [,2] [,3]
[1,]    3   -1    2
[2,]    9    4    6
     [,1] [,2] [,3]
[1,]    5    0    3
[2,]    2    9    4
Result of multiplication
     [,1] [,2] [,3]
[1,]   15    0    6
[2,]   18   36   24
Result of division
     [,1]      [,2]      [,3]
[1,]  0.6      -Inf 0.6666667
[2,]  4.5 0.4444444 1.5000000
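The results above are element-wise. For comparison, the following sketch contrasts * with the %*% operator, which computes the algebraic matrix product; the small matrix A is made up for the illustration.

```r
# A 2 x 2 matrix filled by column: rows are (1, 3) and (2, 4).
A <- matrix(1:4, nrow = 2)

# Element-wise product: each entry is squared.
elementwise <- A * A
print(elementwise)

# Matrix product: rows of the first matrix times columns of the second.
product <- A %*% A
print(product)
```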
Data Frames
A data frame is a table or a two-dimensional array-like structure in which each column contains
values of one variable and each row contains one set of values from each column.
Following are the characteristics of a data frame.
•The column names should be non-empty.
•The row names should be unique.
•The data stored in a data frame can be of numeric, factor or character type.
•Each column should contain the same number of data items.

# Create the data frame.
emp.data <- data.frame(
emp_id = c (1:5),
emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
salary = c(623.3,515.2,611.0,729.0,843.25),
start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
"2015-03-27")),
stringsAsFactors = FALSE
)
# Print the data frame.
print(emp.data)
  emp_id emp_name salary start_date
1      1     Rick 623.30 2012-01-01
2      2      Dan 515.20 2013-09-23
3      3 Michelle 611.00 2014-11-15
4      4     Ryan 729.00 2014-05-11
5      5     Gary 843.25 2015-03-27
Get the Structure of the Data Frame
The structure of the data frame can be seen by using the str() function.
# emp.data is the data frame created above.
# Get the structure of the data frame.
str(emp.data)
Summary of Data in Data Frame
The statistical summary and nature of the data can be obtained by applying the summary()
function.
# emp.data is the data frame created above.
# Print the summary.
print(summary(emp.data))
When we execute the above code, it produces the following result −
     emp_id    emp_name             salary        start_date
 Min.   :1   Length:5           Min.   :515.2   Min.   :2012-01-01
 1st Qu.:2   Class :character   1st Qu.:611.0   1st Qu.:2013-09-23
 Median :3   Mode  :character   Median :623.3   Median :2014-05-11
 Mean   :3                      Mean   :664.4   Mean   :2014-01-14
 3rd Qu.:4                      3rd Qu.:729.0   3rd Qu.:2014-11-15
 Max.   :5                      Max.   :843.2   Max.   :2015-03-27
Extract Data from Data Frame
Extract specific columns from a data frame using the column names.
# emp.data is the data frame created above.
# Extract specific columns.
result <- data.frame(emp.data$emp_name,emp.data$salary)
print(result)
  emp.data.emp_name emp.data.salary
1              Rick          623.30
2               Dan          515.20
3          Michelle          611.00
4              Ryan          729.00
5              Gary          843.25
Extract the first two rows and then all columns
# emp.data is the data frame created above.
# Extract the first two rows.
result <- emp.data[1:2,]
print(result)
  emp_id emp_name salary start_date
1      1     Rick  623.3 2012-01-01
2      2      Dan  515.2 2013-09-23
Extract 3rd and 5th row with 2nd and 4th column
# emp.data is the data frame created above.
# Extract 3rd and 5th row with 2nd and 4th column.
result <- emp.data[c(3,5),c(2,4)]
print(result)
  emp_name start_date
3 Michelle 2014-11-15
5     Gary 2015-03-27
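Rows are often selected by a condition rather than by position. The sketch below rebuilds part of the employee data frame and keeps only the rows with a salary above 620; that threshold and the names high_paid and same are chosen only for the illustration.

```r
emp.data <- data.frame(
   emp_id = c(1:5),
   emp_name = c("Rick","Dan","Michelle","Ryan","Gary"),
   salary = c(623.3,515.2,611.0,729.0,843.25),
   stringsAsFactors = FALSE
)

# A logical condition in the row position filters the rows.
high_paid <- emp.data[emp.data$salary > 620, c("emp_name", "salary")]
print(high_paid)

# subset() expresses the same selection more compactly.
same <- subset(emp.data, salary > 620, select = c(emp_name, salary))
print(same)
```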
Expand Data Frame
A data frame can be expanded by adding columns and rows.
Add Column
Just add the column vector using a new column name.
# emp.data is the data frame created above.
# Add the "dept" column.
emp.data$dept <- c("IT","Operations","IT","HR","Finance")
v <- emp.data
print(v)
  emp_id emp_name salary start_date       dept
1      1     Rick 623.30 2012-01-01         IT
2      2      Dan 515.20 2013-09-23 Operations
3      3 Michelle 611.00 2014-11-15         IT
4      4     Ryan 729.00 2014-05-11         HR
5      5     Gary 843.25 2015-03-27    Finance
Add Row
To add more rows permanently to an existing data frame, we need to bring in the new rows in
the same structure as the existing data frame and use the rbind() function.
In the example below we create a data frame with new rows and merge it with the existing data
frame to create the final data frame.
# The first data frame is emp.data from above, including the "dept" column.
# Create the second data frame.
emp.newdata <- data.frame(
   emp_id = c (6:8),
   emp_name = c("Rasmi","Pranab","Tusar"),
   salary = c(578.0,722.5,632.8),
   start_date = as.Date(c("2013-05-21","2013-07-30","2014-06-17")),
   dept = c("IT","Operations","Finance"),
   stringsAsFactors = FALSE
)
# Bind the two data frames.
emp.finaldata <- rbind(emp.data,emp.newdata)
print(emp.finaldata)
  emp_id emp_name salary start_date       dept
1      1     Rick 623.30 2012-01-01         IT
2      2      Dan 515.20 2013-09-23 Operations
3      3 Michelle 611.00 2014-11-15         IT
4      4     Ryan 729.00 2014-05-11         HR
5      5     Gary 843.25 2015-03-27    Finance
6      6    Rasmi 578.00 2013-05-21         IT
7      7   Pranab 722.50 2013-07-30 Operations
8      8    Tusar 632.80 2014-06-17    Finance
Reshaping?
A. Data Reshaping in R is about changing the way data is organized into rows and columns. Most
of the time data processing in R is done by taking the input data as a data frame. It is easy to
extract data from the rows and columns of a data frame but there are situations when we need
the data frame in a format that is different from the format in which we received it. R has many
functions to split, merge and change the rows to columns and vice-versa in a data frame.
Joining Columns and Rows in a Data Frame
We can join multiple vectors column-wise using the cbind() function, which returns a matrix
(data.frame() would build a data frame instead). Also we can merge two data frames using the
rbind() function.
# Create vector objects.
city <- c("Tampa","Seattle","Hartford","Denver")
state <- c("FL","WA","CT","CO")
zipcode <- c(33602,98104,06161,80294)
# Combine above three vectors column-wise (cbind() returns a character matrix here).
addresses <- cbind(city,state,zipcode)
# Print a header.
cat("# # # # The First data frame\n")
print(addresses)
# Create another data frame with similar columns
new.address <- data.frame(
city = c("Lowry","Charlotte"),
state = c("CO","FL"),
zipcode = c("80230","33949"),
stringsAsFactors = FALSE
)
# Print a header.
cat("# # # The Second data frame\n")
print(new.address)
# Combine rows from both the data frames.
all.addresses <- rbind(addresses,new.address)
# Print a header.
cat("# # # The combined data frame\n")
# Print the result.
print(all.addresses)
# # # # The First data frame
     city       state zipcode
[1,] "Tampa"    "FL"  "33602"
[2,] "Seattle"  "WA"  "98104"
[3,] "Hartford" "CT"  "6161"
[4,] "Denver"   "CO"  "80294"
# # # The Second data frame
       city state zipcode
1     Lowry    CO   80230
2 Charlotte    FL   33949
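Note that the Hartford zip code lost its leading zero because it was entered as a number. A variant of the example, sketched below under the assumption that zip codes should be kept as text, stores them as character strings and builds the first table with data.frame() so that both inputs to rbind() are data frames.

```r
city <- c("Tampa","Seattle","Hartford","Denver")
state <- c("FL","WA","CT","CO")
# Zip codes as strings keep their leading zeros.
zipcode <- c("33602","98104","06161","80294")

# data.frame() keeps each column's own type, unlike cbind().
addresses <- data.frame(city, state, zipcode, stringsAsFactors = FALSE)

new.address <- data.frame(
   city = c("Lowry","Charlotte"),
   state = c("CO","FL"),
   zipcode = c("80230","33949"),
   stringsAsFactors = FALSE
)

# Both data frames have the same columns, so the rows can be stacked.
all.addresses <- rbind(addresses, new.address)
print(all.addresses)
```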
Explain about Packages?
A. R packages are collections of R functions, compiled code and sample data. They are stored
under a directory called "library" in the R environment. By default, R installs a set of packages
during installation. More packages can be added later, when they are needed for some specific
purpose. When we start the R console, only the default packages are loaded. Other
packages which are already installed have to be loaded explicitly before the R program
that needs them can use them.
All the packages available in R language are listed at R Packages.
Below is a list of commands to be used to check, verify and use the R packages.
Check Available R Packages
Get library locations containing R packages
.libPaths()
When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.
[2] "C:/Program Files/R/R-3.2.2/library"
Get the list of all the packages installed
library()
When we execute the above code, it produces the following result. It may vary depending on the
local settings of your pc.
Packages in library ‘C:/Program Files/R/R-3.2.2/library’:

base            The R Base Package
boot            Bootstrap Functions (Originally by Angelo Canty
                for S)
class           Functions for Classification
cluster         "Finding Groups in Data": Cluster Analysis
                Extended Rousseeuw et al.
codetools       Code Analysis Tools for R
compiler        The R Compiler Package
datasets        The R Datasets Package
foreign         Read Data Stored by 'Minitab', 'S', 'SAS',
                'SPSS', 'Stata', 'Systat', 'Weka', 'dBase', ...
graphics        The R Graphics Package
grDevices       The R Graphics Devices and Support for Colours
                and Fonts
grid            The Grid Graphics Package
KernSmooth      Functions for Kernel Smoothing Supporting Wand
                & Jones (1995)
lattice         Trellis Graphics for R
MASS            Support Functions and Datasets for Venables and
                Ripley's MASS
Matrix          Sparse and Dense Matrix Classes and Methods
methods         Formal Methods and Classes
mgcv            Mixed GAM Computation Vehicle with GCV/AIC/REML
                Smoothness Estimation
nlme            Linear and Nonlinear Mixed Effects Models
nnet            Feed-Forward Neural Networks and Multinomial
                Log-Linear Models
parallel        Support for Parallel computation in R
rpart           Recursive Partitioning and Regression Trees
spatial         Functions for Kriging and Point Pattern Analysis
splines         Regression Spline Functions and Classes
stats           The R Stats Package
stats4          Statistical Functions using S4 Classes
survival        Survival Analysis
tcltk           Tcl/Tk Interface
tools           Tools for Package Development
utils           The R Utils Package
Install a New Package
There are two ways to add new R packages. One is installing directly from the CRAN directory
and another is downloading the package to your local system and installing it manually.
Install directly from CRAN
The following command gets the packages directly from CRAN webpage and installs the package
in the R environment. You may be prompted to choose the nearest mirror. Choose the one
appropriate to your location.
install.packages("Package Name")
# Install the package named "XML".
install.packages("XML")
Install package manually

Go to the link R Packages to download the package needed. Save the package as a .zip file in a
suitable location in the local system.
Now you can run the following command to install this package in the R environment.
install.packages(file_name_with_path, repos = NULL, type = "source")
# Install the package named "XML"
install.packages("E:/XML_3.98-1.3.zip", repos = NULL, type = "source")
Load Package to Library
Before a package can be used in the code, it must be loaded into the current R environment. This
also applies to a package that was installed previously but is not yet loaded in the current
session.
A package is loaded using the following command −
library("package Name", lib.loc = "path to library")
# Load the package named "XML"
library("XML")
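library() stops with an error when the package is not installed. One way to guard against that, sketched below, is to test with requireNamespace() first. The helper load_if_available() is a hypothetical name introduced for this illustration, not part of R.

```r
# Hypothetical helper: load a package only if it is installed.
load_if_available <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # character.only = TRUE lets library() accept the name as a string.
    library(pkg, character.only = TRUE)
    TRUE
  } else {
    message("Package '", pkg, "' is not installed; install it with install.packages().")
    FALSE
  }
}

# "stats" ships with R, so this should succeed on a standard installation.
print(load_if_available("stats"))
```

The function returns TRUE or FALSE, so a script can decide whether to continue or to fall back to another approach.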
