In the last week of class, we are going to complete a reflection activity. This discussion is meant to be reflective: write in your own words rather than compiling direct citations from other papers or sources. You can use citations in your posts, but this exercise should be about what you have learned, from your own viewpoint, and not a rehash of any particular article, topic, or the book. Items to include in the initial thread:
- “Interesting Assignments” – What were some of the more interesting assignments to you?
- “Interesting Readings” – What reading or readings did you find the most interesting and why?
- “Perspective” – How has this course changed your perspective?
- “Course Feedback” – What topics or activities would you add to the course, or should we focus on some areas more than others?
Regular Expression
Name
Institutional Affiliation
Importance of Regular Expression in Data Analytics
A regular expression is a pattern used to search for a string of interest within a body of text or
a data stream (Emetere & Akinlabi, 2020). A system that implements regular expressions checks the
text against the pattern and returns a string only if it matches that pattern.
The Importance of Regular Expressions in Data Analytics
Regular expressions matter in data analytics because they help a data analyst manipulate and
transform datasets (Emetere & Akinlabi, 2020). For this reason, regular expressions are used for
pattern matching in data analytics: they give the analyst the ability to manipulate a dataset in
order to reach a specific goal. Examples of regular expression metacharacters include the
asterisk (*), dollar sign ($), square brackets ([]), caret (^), question mark (?), and many more.
The Differences Between The Major Types of Regular Expressions
Square brackets [] match any one character from the set they enclose. An expression followed by
an asterisk matches zero or more occurrences of that expression (Kumar et al., 2020). Other
regular expressions match simple expressions, group expressions with parentheses, match any
single character with a period, or match the start and the end of a line.
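As a rough illustration of the period, the line anchors, and grouping, the sketch below uses Python's
standard re module. The sample strings and variable names are invented for the example and do not
come from the cited sources.

```python
import re

# Hypothetical sample strings, invented for illustration.
lines = ["cat sat", "cut short", "the cat", "ct"]

# A period matches any single character (except a newline by default).
dot_matches = [s for s in lines if re.search(r"c.t", s)]

# ^ anchors the match to the start of the string, $ to the end.
starts_with_c = [s for s in lines if re.search(r"^c", s)]
ends_with_cat = [s for s in lines if re.search(r"cat$", s)]

# Parentheses group part of a pattern so that it can be captured.
m = re.search(r"(\w+) (\w+)", "cat sat")
first_word, second_word = m.group(1), m.group(2)

print(dot_matches)
print(starts_with_c)
print(ends_with_cat)
print(first_word, second_word)
```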
The Difference Between Two Types of Regular Expressions
The two chosen expressions are the asterisk and square brackets. Square brackets [] match any
single character from the set they enclose (Kumar et al., 2020). They do not match the start or
the end of a line; that is the job of the caret (^) and the dollar sign ($), although a caret
placed first inside the brackets negates the set. An asterisk, on the other hand, matches zero or
more occurrences of the expression that precedes it. Outside regular expressions, in text
messaging, an asterisk can also be used to mark the correction of a word in a previous text.
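The contrast between the two can be sketched with Python's standard re module. The word list is
made up for the example; re.fullmatch is used so that the whole word must fit the pattern.

```python
import re

# Hypothetical word list, invented for illustration.
words = ["gry", "gray", "grey", "graay", "groy"]

# [ae] matches exactly one character, which must be 'a' or 'e'.
bracket = [w for w in words if re.fullmatch(r"gr[ae]y", w)]

# a* matches zero or more occurrences of the preceding 'a'.
star = [w for w in words if re.fullmatch(r"gra*y", w)]

# Combined: [ae]* matches any run of 'a' or 'e' characters, including none.
both = [w for w in words if re.fullmatch(r"gr[ae]*y", w)]

print(bracket)
print(star)
print(both)
```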
References
Emetere, M. E., & Akinlabi, E. T. (2020). Modeling Big Data and Further Analysis.
In Introduction to Environmental Data Analysis and Modeling (pp. 79-155). Springer,
Cham.
Kumar, Y., Sood, K., Kaul, S., & Vasuja, R. (2020). Big Data Analytics and Its Benefits
in Healthcare. In Big Data Analytics in Healthcare (pp. 3-21). Springer, Cham.
Learning objectives
Participants will be able to:
1. Understand different ways of summarizing data
2. Choose the right table/graph for the right data and audience
3. Ensure that graphics are self-explanatory
4. Create graphs and tables that are attractive
Data Presentation, Interpretation and Use
Do you present yourself like this?
So why would you present your data like this?
Or this?
[Figure: unlabeled bar chart; y-axis 0 to 100; legend: Any net, LLIN]
This is Better!
Use of ITNs in Zambia
Effective presentation
• Clear
• Concise
• Actionable
• Attractive
Effective presentation
• For all communication formats, it is important to ensure that there is:
– Consistency
• Font, colors, punctuation, terminology, line/paragraph spacing
– An appropriate amount of information
• Less is more
– Appropriate content and format for the audience
• Scientific community, journalists, politicians
Summarizing data
• Tables
– Simplest way to summarize data
– Data is presented as absolute numbers or percentages
• Charts and graphs
– Visual representation of data
– Usually data is presented using percentages
Points to remember
• Ensure graphic has a title
• Label the components of your graphic
• Indicate source of data with date
• Provide number of observations (n=xx) as a reference point
• Add footnote if more information is needed
Tips for Presenting Data in PowerPoint
• All text should be readable
• Use sans serif fonts
– Gill Sans (sans serif)
– Times New Roman (serif)
• Use graphs or charts, not tables
• Keep slides simple
• Limit animations and special effects
• Use high contrast text and backgrounds
Choosing a Title
• A title should express
– Who
– What
– When
– Where
Tables: Frequency distribution
Year    Number of cases
2000    4 216 531
2001    3 262 931
2002    3 319 339
2003    5 338 008
2004    7 545 541
2005    9 181 224
2006    8 926 058
2007    9 610 691
Tables: Relative frequency
Percent contribution of reported malaria cases by year between 2000 and 2007, Kenya
Year     Number of malaria cases (n)    Relative frequency (%)
2000     4 216 531                      8
2001     3 262 931                      6
2002     3 319 339                      7
2003     5 338 008                      10
2004     7 545 541                      15
2005     9 181 224                      18
2006     8 926 058                      17
2007     9 610 691                      19
Total    51 400 323                     100.0
Source: WHO, World Malaria Report 2009
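A minimal sketch of how the relative frequency column can be derived: each year's count is divided by the total and multiplied by 100. The case counts are copied from the table; the variable names are chosen for the example. Rounding each share to a whole number independently can leave the column summing to slightly more or less than 100, so a published table may adjust an entry.

```python
# Reported malaria cases by year, copied from the table.
cases = {
    2000: 4_216_531,
    2001: 3_262_931,
    2002: 3_319_339,
    2003: 5_338_008,
    2004: 7_545_541,
    2005: 9_181_224,
    2006: 8_926_058,
    2007: 9_610_691,
}

total = sum(cases.values())

# Relative frequency (%) = 100 * count for the year / total count.
relative = {year: 100 * n / total for year, n in cases.items()}

for year, pct in relative.items():
    print(f"{year}  {cases[year]:>10,}  {pct:5.1f}%")
print(f"Total {total:,}")
```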
Use the right type of graphic
• Charts and graphs
– Bar chart: comparisons, categories of data
– Histogram: represents relative frequency of continuous data
– Line graph: displays trends over time, continuous data (e.g., cases per month)
– Pie chart: shows percentages or proportional share
Bar chart
[Figure: bar chart without title or axis labels; y-axis 0 to 100; legend: Any net, LLIN]
Bar Chart
[Figure: bar chart, "Household Ownership of at Least 1 Net or ITN, 2008"; y-axis: Percent (0 to 100); x-axis: Country 1, Country 3, Country 4, Country 5; legend: Any net, LLIN]
Source: Quarterly Country Summaries, 2008
Stacked bar chart
[Figure: stacked bar chart; axis label truncated in source: "% Children"]
>>> var1 = 1
>>> var2 = 10
Python supports three built-in numerical types (Python 2 also had a separate long type, which Python 3 merged into int) −
• int (signed integers of unlimited size; they can also be written in octal and hexadecimal)
• float (floating point real values)
• complex (complex numbers)
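A short sketch of these types, assuming Python 3 (where large integers are plain int values rather than a separate long type). The variable names other than var1 are invented for the example.

```python
var1 = 1            # int
big = 2 ** 100      # still an int: Python 3 integers grow as needed
hex_value = 0x1F    # int written in hexadecimal (31)
oct_value = 0o17    # int written in octal (15)
ratio = 3.5         # float
z = 2 + 3j          # complex, with real part 2.0 and imaginary part 3.0

# Print the type name next to each value.
for value in (var1, big, hex_value, oct_value, ratio, z):
    print(type(value).__name__, value)
```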
Strings
Strings are objects: they have methods that return the result of a function applied to the string. Example: assign the string "My Name" to a variable,
then call the methods title, upper, isdigit and islower; the last two return Boolean results
>>> name = "My Name"
>>> name.title()
'My Name'
>>> name.upper()
'MY NAME'
>>> name.isdigit()
False
>>> name.islower()
False
Other Python objects
(strings)
The plus (+) sign is the string concatenation operator and the asterisk (*) is the repetition operator. For example −
>>> str = 'Hello World!'
>>> print(str)            # Prints complete string
Hello World!
>>> print(str[0])         # Prints first character of the string
H
>>> print(str[2:5])       # Prints characters from 3rd to 5th
llo
>>> print(str[2:])        # Prints string starting from 3rd character
llo World!
>>> print(str * 2)        # Prints string two times
Hello World!Hello World!
>>> print(str + "TEST")   # Prints concatenated string
Hello World!TEST
Other Python objects
(lists)
Values stored in a list can be accessed using the slice operator ([ ] and [:]), with indexes starting at 0 at the beginning of the list and
running to the length of the list minus 1 at the end.
>>> list = ['abcd', 786, 2.23, 'john', 70.2]
>>> tinylist = [123, 'john']
>>> print(list)              # Prints complete list
['abcd', 786, 2.23, 'john', 70.2]
>>> print(list[0])           # Prints first element of the list
abcd
>>> print(list[1:3])         # Prints elements from 2nd to 3rd
[786, 2.23]
>>> print(list + tinylist)   # Prints concatenated lists
['abcd', 786, 2.23, 'john', 70.2, 123, 'john']
Other Python objects
(tuples)
The main differences between lists and tuples are: lists are enclosed in brackets ( [ ] ) and their elements and size can be changed,
while tuples are enclosed in parentheses ( ( ) ) and cannot be updated. Tuples can be thought of as read-only lists. For example −
>>> tuple = ('abcd', 786, 2.23, 'john', 70.2)
>>> tinytuple = (123, 'john')
>>> print(tuple)               # Prints complete tuple
('abcd', 786, 2.23, 'john', 70.2)
>>> print(tuple[0])            # Prints first element of the tuple
abcd
>>> print(tuple[1:3])          # Prints elements from 2nd to 3rd
(786, 2.23)
>>> print(tinytuple * 2)       # Prints tuple two times
(123, 'john', 123, 'john')
>>> print(tuple + tinytuple)   # Prints concatenated tuples
('abcd', 786, 2.23, 'john', 70.2, 123, 'john')
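The read-only behaviour of tuples can be sketched as follows, assuming Python 3. The names items, record and tuple_updated are invented for the example.

```python
items = ['abcd', 786, 2.23]      # list: elements and size can change
record = ('abcd', 786, 2.23)     # tuple: cannot be updated

items[1] = 1000                  # allowed: replaces an element in place
items.append('new')              # allowed: the list grows by one element

try:
    record[1] = 1000             # not allowed: raises TypeError
    tuple_updated = True
except TypeError as exc:
    tuple_updated = False
    print("TypeError:", exc)

print(items)
print(record)
```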
Other Python objects
(dictionaries)
Dictionaries are enclosed by curly braces ({ }) and values can be assigned and accessed using square brackets ([]). For example −
>>> dict = {}
>>> dict['one'] = "This is one"
>>> dict[2] = "This is two"
>>> tinydict = {'name': 'john', 'code': 6734, 'dept': 'sales'}
>>> print(dict['one'])          # Prints value for 'one' key
This is one
>>> print(dict[2])              # Prints value for 2 key
This is two
>>> print(tinydict)             # Prints complete dictionary
{'name': 'john', 'code': 6734, 'dept': 'sales'}
>>> print(tinydict.keys())      # Prints all the keys
dict_keys(['name', 'code', 'dept'])
>>> print(tinydict.values())    # Prints all the values
dict_values(['john', 6734, 'sales'])
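A few other common dictionary operations, sketched under the assumption of Python 3.7 or later (where dictionaries keep insertion order). The key 'office' and the names pairs, office and has_code are invented for the example.

```python
tinydict = {'name': 'john', 'code': 6734, 'dept': 'sales'}

# items() yields (key, value) pairs in insertion order.
pairs = [f"{key}={value}" for key, value in tinydict.items()]

# get() returns a default instead of raising KeyError for a missing key.
office = tinydict.get('office', 'unknown')

# The 'in' operator tests for a key, not for a value.
has_code = 'code' in tinydict

print(pairs)
print(office)
print(has_code)
```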
Let’s look at some practical uses
Python – Using the Tool (part III)
• Output a message to console
• Arithmetic Operations
• Create a user-defined Function
• Create a Class Object
Output a String
>>> print('Welcome to the ITS530 course at UC!')
Welcome to the ITS530 course at UC!
Arithmetic Operations
>>> a = 21
>>> b = 10
>>> c = 0
>>> a + b    # add a and b
31
>>> a - b    # subtract b from a
11
>>> a * b    # multiply a times b
210
>>> a / b    # a divided by b
2.1
>>> a ** b   # a raised to the power of b
16679880978201
>>> a % b    # a modulus b
1
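As a small supplement, assuming Python 3: the / operator always performs true division, while // gives the whole-number quotient and divmod returns quotient and remainder together. The result names are invented for the example.

```python
a, b = 21, 10

true_div = a / b                     # true division returns a float
floor_div = a // b                   # floor division drops the fractional part
quotient, remainder = divmod(a, b)   # both results in one call
power = a ** b                       # integers do not overflow in Python 3

print(true_div, floor_div, quotient, remainder, power)
```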
User-defined Function
def greet(name):            # creates a function with name as argument
    print('Hello', name)    # prints a greeting for the given name

greet('Jack')
greet('Jill')
greet('Bob')
Define a Class Object
class BankAccount:                           # Defines a class (BankAccount)
    def __init__(self, initial_balance=0):   # Constructor; the balance defaults to 0
        self.balance = initial_balance       # Set self.balance to the initial balance

    def deposit(self, amount):               # Method taking the amount to deposit
        self.balance += amount               # Add the amount deposited to self.balance

    def withdraw(self, amount):              # Method taking the amount to withdraw
        self.balance -= amount               # Subtract the amount withdrawn from self.balance

    def overdrawn(self):                     # Method taking no extra arguments
        return self.balance < 0              # True if the balance is negative

my_account = BankAccount(15)
my_account.withdraw(50)
print(my_account.balance, my_account.overdrawn())   # prints: -35 True
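One possible extension, sketched here as an assumption rather than as part of the course material: a subclass that refuses a withdrawal larger than the balance. GuardedAccount and the names plain, guarded and refused are hypothetical; BankAccount is repeated so the block runs on its own.

```python
class BankAccount:
    def __init__(self, initial_balance=0):
        self.balance = initial_balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def overdrawn(self):
        return self.balance < 0


class GuardedAccount(BankAccount):
    """Hypothetical variant that refuses withdrawals that would overdraw."""

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        super().withdraw(amount)


plain = BankAccount(15)
plain.withdraw(50)               # allowed: the balance goes negative

guarded = GuardedAccount(15)
try:
    guarded.withdraw(50)         # refused: the balance stays unchanged
    refused = False
except ValueError:
    refused = True

print(plain.balance, plain.overdrawn())
print(guarded.balance, refused)
```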