Built-in Functions
Why Learn Built-in Functions in Data Analytics?
Imagine you are analyzing product reviews for an e-commerce website. One customer rating is stored as a text string: "4.8". Another review contains a data-entry error that stores the rating as a negative number: -3.5 (it should be 3.5).
To use these ratings in your analytics model, you need to:
- Convert the text "4.8" into a real decimal number.
- Convert the negative -3.5 to its absolute value, 3.5.
- Round a long average rating like 4.166667 to a clean 4.2 for display.
Instead of writing this logic yourself, you can use the pre-built solutions R provides, called built-in functions. Knowing them lets you handle each of these cleaning and formatting steps with a single function call.
1. What is a Function?
A function is a reusable block of code that takes one or more inputs (arguments), processes them, and returns an output.
In the previous chapter, you used print() and class(). These are built-in functions:
class(4.5): The argument is 4.5, and the function returns the character string "numeric".
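As a small sketch of the input-to-output idea, the lines below call a few built-in functions and store what they return. The variable names (magnitude, shortened, result_type) are made up for illustration, and abs() and round() are introduced properly later in this chapter.

```r
# abs() takes one argument and returns its absolute value
magnitude <- abs(-3.5)

# round() takes two arguments; naming the second one makes the call easier to read
shortened <- round(4.166667, digits = 1)

# The output of one function can be used as the input of another
result_type <- class(abs(-3.5))

print(magnitude)    # 3.5
print(shortened)    # 4.2
print(result_type)  # "numeric"
```

Notice that every call follows the same pattern: the function name, then its arguments inside parentheses.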
2. R Built-in Data Type Conversions
Because imported data often arrives as text (character class), you will frequently need to convert it to numbers or logicals. In R, the conversion functions are prefixed with as.:
- as.numeric(): Converts to decimal/numeric.
- as.integer(): Converts to integer.
- as.character(): Converts to character text.
- as.logical(): Converts to logical (TRUE/FALSE).
# Data cleaning example
raw_rating <- "4.8"
clean_rating <- as.numeric(raw_rating)
print(class(raw_rating)) # "character"
print(class(clean_rating)) # "numeric"
print(clean_rating * 2) # 9.6 (Now mathematical calculations work!)
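Conversions do not always produce the value you might expect, so it can help to try a few edge cases. The sketch below uses made-up example values. suppressWarnings() is included only to keep the output quiet, because R normally warns when text cannot be converted.

```r
# as.integer() drops the decimal part instead of rounding
whole_rating <- as.integer("4.8")

# Text that is not a number becomes NA (R's missing value)
missing_rating <- suppressWarnings(as.numeric("four stars"))

# as.logical() understands text such as "TRUE", "FALSE", "T" and "F"
is_verified <- as.logical("TRUE")

# as.character() turns a number back into text
rating_label <- as.character(4.8)

print(whole_rating)           # 4
print(is.na(missing_rating))  # TRUE
print(is_verified)            # TRUE
print(rating_label)           # "4.8"
```

Checking for NA with is.na() after a conversion is a simple way to spot values that could not be cleaned.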
3. Handy Mathematical Built-in Functions
R is built for statistics, so it includes a broad set of math functions out of the box:
- abs(x): Returns the absolute value of x.
- sqrt(x): Returns the square root of x.
- round(x, digits): Rounds x to a specified number of decimal places.
- ceiling(x): Rounds x up to the nearest integer.
- floor(x): Rounds x down to the nearest integer.
# Cleaning a negative rating and rounding
bad_entry <- -3.567
clean_entry <- abs(bad_entry)
rounded_entry <- round(clean_entry, digits = 1)
print(clean_entry) # 3.567
print(rounded_entry) # 3.6
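The remaining functions from the list can be tried the same way. This is a short sketch with illustrative values. The last two lines show that "up" and "down" refer to the number line, which matters for negative numbers.

```r
avg_rating <- 4.166667

root_value   <- sqrt(16)
rounded_up   <- ceiling(avg_rating)
rounded_down <- floor(avg_rating)
two_decimals <- round(avg_rating, digits = 2)

print(root_value)    # 4
print(rounded_up)    # 5
print(rounded_down)  # 4
print(two_decimals)  # 4.17

# For negative numbers, ceiling() moves toward zero and floor() moves away from it
print(ceiling(-3.5))  # -3
print(floor(-3.5))    # -4
```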
4. Getting User Input in R: readline()
To pause a script and receive input from the user, R uses the readline() function. You can pass a prompt message inside the function.
# Get user input
user_name <- readline(prompt = "Enter your name: ")
print(paste("Hello", user_name))
Important: Just like Python's input(), R's readline() always returns the input as a character string, even if the user types a number. If you need a number, you must wrap it in as.numeric() or as.integer():
# Chaining readline and conversion
user_age <- as.integer(readline(prompt = "Enter your age: "))
print(user_age + 1) # Adds 1, provided the user typed a whole number
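If the user types something that is not a number, the conversion returns NA instead of a usable value. The sketch below shows one possible way to check for that. The variable typed_age is a stand-in for what readline() would return, so the example can run without waiting for input, and the message text is made up.

```r
# Stand-in for: typed_age <- readline(prompt = "Enter your age: ")
typed_age <- "abc"

# Non-numeric text becomes NA; suppressWarnings() hides the conversion warning
user_age <- suppressWarnings(as.integer(typed_age))

# is.na() tells us whether the conversion failed
if (is.na(user_age)) {
  reply <- "Please enter a whole number."
} else {
  reply <- paste("Next year you will be", user_age + 1)
}

print(reply)  # "Please enter a whole number."
```

The same check is useful when a script is run non-interactively (for example with Rscript), where readline() does not wait for input and returns an empty string, which also converts to NA.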
Hands-on Exercises
Exercise 1: Cleanup and Compute
You receive a sensor temperature value as a text string: "102.73". Write R code to:
- Store "102.73" in a variable.
- Convert it to a numeric data type.
- Calculate the square root of the temperature.
- Round the square root to 2 decimal places and print it.
Answer
temp_str <- "102.73"
temp_num <- as.numeric(temp_str)
temp_sqrt <- sqrt(temp_num)
temp_rounded <- round(temp_sqrt, digits = 2)
print(temp_rounded)
Exercise 2: Absolute Growth Target
A company had a sales change metric of -12.4%.
Write R code to:
- Store -12.4 in a variable.
- Convert it to its absolute value to represent the magnitude of the change.
- Round the result up to the next whole percentage.
- Print the final target change.
Answer
sales_change <- -12.4
absolute_magnitude <- abs(sales_change)
target_change <- ceiling(absolute_magnitude)
print(target_change) # Output: 13