I’m looking for an A-quality answer. It is due on the 21st at 4 pm (UTC-08:00, Pacific Time, US & Canada).
Here are the instructions for this assignment.
In C++
Improve my multithreading example as follows:
(for both, reader and writer):
1. While not available do:
Enter the sleep mode
Wake and check the resources
End:
2. Acquire the lock on the resource
3. Check if it is still available (it may have been taken from you between the moment you checked for the resource and the moment you tried to set the lock).
Think about what you will do if it has become unavailable.
4. Read/write the file
5. Do not forget to properly clean up after yourself: in my code I did not close the file; that is an error, especially in this app.
6. Release the lock
7. Done with the thread.
Main should wait for both threads to complete, and output some status message.
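The seven steps above could be sketched roughly as follows. This is only one possible reading of the assignment, not the lecture's code: `SharedFile`, `acquire`, `release`, `writeLines`, `readLines` and `runReaderAndWriter` are all hypothetical names, and each thread opens its own stream instead of sharing the global `fstream`. The "lock on the resource" is modelled as a `busy` flag that is only changed while a `std::mutex` is held, so the re-check in step 3 can really fail. Here the answer to "what if it becomes unavailable" is to go back to sleep; returning with an error would be another option.

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

// Every name below is hypothetical; none comes from the lecture code.
struct SharedFile {
    std::string name;
    std::mutex guard;               // protects every change to 'busy'
    std::atomic<bool> busy{false};  // true while one thread owns the file
};

// Steps 1-3: sleep until the file looks free, lock, re-check, claim.
void acquire(SharedFile& f) {
    for (;;) {
        while (f.busy.load())       // 1. not available: sleep, wake, check
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        std::lock_guard<std::mutex> lock(f.guard);  // 2. acquire the lock
        if (!f.busy.load()) {       // 3. still available?
            f.busy = true;          //    yes: the file is ours
            return;
        }
        // Another thread claimed it in between: unlock and sleep again.
    }
}

// Step 6: hand the file back.
void release(SharedFile& f) {
    std::lock_guard<std::mutex> lock(f.guard);
    f.busy = false;
}

int writeLines(SharedFile& f, int n) {
    acquire(f);
    int written = 0;
    {
        std::ofstream out(f.name);  // 4. write the file
        for (int i = 0; i < n && (out << "line " << i << '\n'); ++i)
            ++written;
    }                               // 5. 'out' is closed at the end of this scope
    release(f);
    return written;
}

int readLines(SharedFile& f) {
    acquire(f);
    int count = 0;
    {
        std::ifstream in(f.name);   // 4. read; a missing file yields 0 lines
        std::string line;
        while (std::getline(in, line)) ++count;
    }                               // 5. 'in' is closed at the end of this scope
    release(f);
    return count;
}

// Step 7: the caller (for example main) waits for both threads, then reports.
bool runReaderAndWriter(const std::string& name, int lines) {
    std::remove(name.c_str());      // start from a known state
    SharedFile f;
    f.name = name;
    int written = 0, seen = 0;
    std::thread writer([&] { written = writeLines(f, lines); });
    std::thread reader([&] { seen = readLines(f); });
    writer.join();
    reader.join();
    std::cout << "written: " << written << ", read: " << seen << '\n';
    // The reader sees the whole file or none of it, never a partial write.
    return written == lines && (seen == 0 || seen == lines);
}
```

Calling `runReaderAndWriter("mytextfile.txt", 100)` from `main` gives the status message the assignment asks for. A `std::condition_variable` could replace the sleep-and-poll loop; the polling form is kept here only because the assignment describes it that way.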
Vladimir Sverdlov. CS222. Addendum to lecture 15. Multithreading.
Summary.
I was not going to talk about multithreading. It is not in the course’s catalog, nor in the syllabus; I
specifically indicated that it is not in our scope and that writing a multithreading application is well
beyond this advanced class. So I was not going to talk about that.
However, I received a request to do so. My first move was to deny that request. Writing a
multithreading app is involved. You have probably seen a lot of examples confirming what I just said… Even
in professional software. Writing a good and robust multithreading app is involved, and if we did
that, it would probably take a good two or three weeks, or maybe more, for a simple program.
But then one thing occurred to me.
I will talk about that to underscore once again some of the rules and practices that I had so adamantly
been pushing upon you for the entire semester, and which you so persistently resisted.
If you search the web you will see a lot of examples of how to write multithreading apps. You will
see how easy it is in the new C++ (since C++11). Lucky you: if you have to do that, you most
likely will not have to do it with the POSIX library. Lucky you!.. Anyway, there are a lot of
examples out there. What I will do here is take almost what you could find online or in books … and
extend it … just a little. I will connect it to the techniques and approaches of writing code
that you consistently used in your code, and which, I am sure, you would still be using
had I not been so insistent.
I once received a concern that the first half of the semester was … slow, while the rest was brutal. But…
how could we move forward if, submission after submission, I still saw the same problems… again and
again, and only the midterm seemingly brought some change of mind? You cannot write a … program, if
you cannot write a … program.
Anyway. I will write code that purposefully shows you the danger and vulnerability of using global
variables and defines, and of the practice of not checking for resources, which some of you follow even now.
One thing I will not include in this list is neglecting the housekeeping. The reason: I do
not want to restart my computer after running that code.
With that introduction, let’s discuss some concepts.
But first:
#define multithreading application MTA
As with everything else, you do not write the MTA just because you want to and just because you can.
You need to have a good justification for that.
1. Suppose you write heavy, computationally intensive software. If you can split the computations
into parts, and if you can assemble the results of those individual computations into the whole
unit afterwards, that would be a good candidate for the MTA.
2. Suppose you have software dealing with a lot of heavy data. If you can split the data into
smaller chunks, and if you can reconstitute the final result afterwards (the divide and conquer
strategy, which should be familiar to people who took algorithms), that would be a good candidate
for the MTA.
a. As a part of ##1 and 2 above, suppose you have a lot of input-output. For example, you
read a lot of data from, or write it to, a database. If your database stores data in different
partitions, you may consider dedicating a separate thread to each partition.
3. Suppose you do heavy computations and you need to update the GUI in real time… That one
too. That one is tricky to do correctly, so that your GUI does not freeze, hang, crash, or become
non-responsive or erratic. And because it is a GUI, the errors and mistakes here are especially
visible.
4. Server-side applications. The best and classic example is the web server. People probably
know how a web server works. There is a main thread (daemon) that listens for incoming
connections, and when one comes, it dispatches a new thread to serve that request, while it keeps
listening for other connections.
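Cases 1 and 2 above (split the work, compute the parts, reassemble the result) could be sketched as follows. This is a minimal illustration under assumptions, not code from the lecture: `parallelSum` is a hypothetical helper, and summing integers stands in for whatever heavy computation the real program does.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums 'data' using 'parts' threads (at least one).
long parallelSum(const std::vector<int>& data, unsigned parts) {
    if (parts == 0) parts = 1;
    std::vector<long> partial(parts, 0);   // one result slot per thread
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + parts - 1) / parts;

    for (unsigned p = 0; p < parts; ++p) {
        const std::size_t begin = std::min(data.size(), p * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each thread reads its own chunk and writes only its own slot,
        // so no locking is needed while the threads run.
        pool.emplace_back([&data, &partial, p, begin, end] {
            partial[p] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0L);
        });
    }
    for (auto& t : pool) t.join();         // wait for every part

    // Reassemble the whole from the parts.
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

A caller might pass `std::thread::hardware_concurrency()` as `parts`; that function may return 0 when the value cannot be determined, which is why the sketch falls back to one thread.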
Let me pause here for a moment. In the last example, a new thread is spawned every time a new request
comes. That’s one of the reasons why DoS attacks are so hard to withstand. The problem is not only
that the network resources are consumed; the other, and big, problem is that new threads are consumed.
Hardware, however powerful it can be, still has limits. Operating systems, however robust they can be, still have
limits. The number of threads in the system, as with any other system resource, is finite. There are only
so many threads available. People who took my Linux class can probably recall the example of how easily
and how fast I could exhaust the thread resources, after which the system essentially stops
responding…
I talk about that not because I just wanted to, but because I wanted to continue discussion going to this:
3. Some considerations when you decide whether you want to / need to do an MTA
So, you run the application, and you decide that you want to split some of the tasks between two (or
more) threads. Let’s consider for now that you run everything on one single machine. Suppose it was a
dedicated machine that was used to run your application only. That means that (aside from the OS
overhead) all available resources were dedicated to your process. Now you add one more thread to
that. That means you take some share of the CPU, memory, storage, networking, file handles… etc.
from your first thread. That means its productivity per unit of time will diminish, because there is
another thread that it needs to share the resources with. If both of these threads do a lot of
memory allocation/deallocation, memory performance may degrade faster, and when it comes to a
certain point, the OS will need to re-align the memory, which considerably slows all the processes
running on the system. In short, when you spawn a new thread, you will not get a 100% increase in
productivity. In reality it will be less. In the worst cases you may not even see as much gain as
you hoped for, but you have already invested heavily in writing the MTA… Bummer.
To summarize, as with everything else, you need to carefully evaluate and assess, whether the MTA is
what you really need.
Now, I made a comment about running all threads on one single machine. But you may not be
constrained to that. You may have an array of machines available. Or you may have multiple cores. If you
can utilize that infrastructure and make each thread take a dedicated machine or core, that is
where you really unleash the processing power.
Some years ago there was a community project to track dangerous near-Earth space objects. If I
remember correctly, it was run and managed by a university observatory in Australia
(I cannot recall the names). The setup was that anyone could sign up, and while your computer was idle, its
resources were used to run parts of these computations. That is a good example of distributed
computing. But again, that is not in the scope of this course, and I need to keep the format and the size
of the lecture.
1. So, you started your program, and at some point you spawn a new thread. You need to
understand that everything before the bifurcation point is shared and available to that new
thread. Everything after the point of bifurcation may be thread-specific, but what’s before is
fully inherited by the thread, including env variables. Thus, if you have global variables (those
not in a particular scope, but in the scope of the entire program), they will be available and
accessible to all the threads. That can be a good thing: people often use that property to
control the threads, to send communications to a thread, or to do inter-thread
communication using the global variables. But however easy and simple this may seem, it is
a potential security flaw or a source of very hard to figure out bugs.
2. Alright, you still need to share some resources (files, variables…); what to do?
When you do an MTA that is a little more involved than the examples you may see online, you will
need to take a lot of care to correctly execute (schedule the execution of) some parts of your code. You
also need to take a lot of care when updating/accessing some of the variables, resources etc. The simplest (and
again, probably the most ubiquitous) example: you update a record in the database. You do not want
anyone to access that record (for reading or, even more so, for writing) until you finish writing to
it. So, when you do an MTA, you need to know and correctly execute two things:
a. A critical section of your code. That is the part that should be executed as one block, without
interruption or interference. When you come to the critical section of the code, you lock the
execution (for example by setting a semaphore), and when you leave that section, you unlock
(release) the lock.
b. The concept of an atomic operation. Say you want to update some variable. For example, you
want to increment a counter. You want to do it in such a way that no other thread is
able to access that variable during that operation, either for reading or for writing.
3. You need to know how to schedule your threads. Sometimes you want the execution to be in a
certain order or sequence, or otherwise synchronized.
4. You need to understand that unless you take special care (thus diminishing the advantage
of multiple threads running in parallel), the output may be (and will be) interlaced and
completely out of order. So if you look at the log, for example, you may have a hard time tracing
everything down.
5. Finally, testing of the MTA is an involved task by itself.
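Items 2a and 2b above (the critical section and the atomic operation) could look roughly like this in C++11. It is a sketch under assumptions: `Totals` and `countBothWays` are hypothetical names, and a `std::mutex` with `std::lock_guard` is used for the critical section in place of the semaphore the text mentions.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical result type: the same count obtained in two ways.
struct Totals {
    long locked;   // counter updated inside a critical section
    long atomic;   // counter updated with an atomic operation
};

// 'threads' threads each perform 'perThread' increments on both counters.
Totals countBothWays(int threads, int perThread) {
    long lockedCounter = 0;              // only touched while 'm' is held
    std::mutex m;
    std::atomic<long> atomicCounter{0};

    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < perThread; ++i) {
                {   // critical section: one thread at a time gets past the lock
                    std::lock_guard<std::mutex> guard(m);
                    ++lockedCounter;
                }   // the lock is released when 'guard' goes out of scope
                ++atomicCounter;         // atomic read-modify-write, no lock
            }
        });
    }
    for (auto& th : pool) th.join();
    return {lockedCounter, atomicCounter.load()};
}
```

With either technique every increment is counted, so both totals equal `threads * perThread`. A plain `long` incremented from several threads without the lock would give no such guarantee.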
Alright. Enough with the boring theories, let’s see some code.
If you search online, you will see that with the new thread library you can create a thread in a number
of different ways:
1. By passing a pointer to a function. We discussed that approach, and we used a very similar
approach when we passed a sorting predicate to the STL’s sorting functions. Because of that,
let’s not use it here.
2. By passing a functor. That’s interesting. Instead of passing just a function to the threads, you
pass the entire object. That gives you more power. We talked about functors in one of the
previous lectures, and in the last lecture too. That is also more involved than the simplest
examples. Let’s use that.
3. By passing a lambda expression. That you will see in the examples too. Here I will allow
myself to express a very strong opinion, which is just my personal opinion and may or may
not be shared by many other people. Which is this:
You do not use tools just because you can. If you want to use a lambda expression to do the work in the
thread… Well, perhaps you can just use an asynchronous function instead of full-fledged threads.
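The three options, plus the asynchronous alternative, could be written side by side as in the sketch below. The names `addOne`, `AddN` and `startThreeWays` are hypothetical and exist only for illustration; `std::async` with `std::launch::async` is assumed to be what "an asynchronous function" refers to.

```cpp
#include <functional>
#include <future>
#include <thread>

// 1. A plain function, passed to the thread as a function pointer.
void addOne(int& x) { ++x; }

// 2. A functor: an object with an overloaded operator(), which can carry state.
struct AddN {
    int n;
    void operator()(int& x) const { x += n; }
};

int startThreeWays() {
    int a = 0, b = 0, c = 0;

    // std::ref is needed because std::thread copies its arguments by default.
    std::thread t1(addOne, std::ref(a));    // function pointer
    std::thread t2(AddN{10}, std::ref(b));  // functor
    std::thread t3([&c] { c = 100; });      // 3. lambda expression

    t1.join();
    t2.join();
    t3.join();

    // The alternative to a hand-managed thread: run the task asynchronously
    // and collect its result through a future.
    std::future<int> f = std::async(std::launch::async, [] { return 1000; });

    return a + b + c + f.get();             // 1 + 10 + 100 + 1000
}
```

Each thread writes to its own variable, so no locking is needed here; the values are read only after `join()` and `get()` have returned.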
So, we will create an object and pass that object to the thread. I will write a very simple MTA with two
threads, one writing to the file, and one reading from the file. That may probably already raise a
suspicion, that my threads will compete for the same resource.
YES.
And that’s exactly what I want to show. I will use global variables and a lack of checks, just as most
if not all of you did in your code earlier in the semester. And I will do it to purposefully cause a
race condition.
A race condition occurs when more than one thread competes for the same resource and the outcome depends on
which thread gets there first.
The race condition may result in:
1. Denial of that resource to other threads
2. Deadlock. Program enters a deadlock and cannot exit from it.
3. Inconsistent, incorrect, un-predictable data or output.
4. It can just crash.
Out of these four, I don’t know which one is worse. But I can probably say, that none is good.
So I’ll lead the threads into a race condition and we’ll see what happens.
But before that, the usual disclaimer… you know it: it is just my code, etc. You also know my request not
to share it or make it publicly available, and to respect that request. But in addition to that, if you
decide not to honor my request, at least supply the code along with the complete description above,
because I don’t want any established programmer looking at the code and asking: “Who ever wrote this
thing?”
Alright. The code. First version. Global variables, no check for the resources. All the comments –
inline.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include "mingw.thread.h" //Your include may be different, probably
using namespace std;
enum MODE {INVALID = -1, READ, WRITE
};
//Global vars. Accessible to all threads. No protection whatsoever.
fstream myFile;
//That one is even worse. That completely removes compiler’s
//protection and type checking.
#define FILENAME "mytextfile.txt"
//////////// Classes declarations ////////////////////
/*******************************
* The underlying file functionality. Opens file for read/write access
* Only available to the derived classes on which it depends to correctly
* set the mode
*/
class myFileOp
{
protected:
myFileOp(MODE);
~myFileOp(){
}
void operator()(const string &);
MODE Mode;
private:
myFileOp(){Mode = INVALID;}
};
/*******************************
* File reader. Inherits file open functionality from its base class.
* Passes the READ mode to the base class. Overrides operator() for the functor
*/
class myFileReader : public myFileOp
{
public:
myFileReader();
~myFileReader(){}
void operator()(const string &) ;
};
/*******************************
* File writer. Inherits file open functionality from its base class.
* Passes the WRITE mode to the base class. Overrides operator() for the functor
*/
class myFileWriter : public myFileOp
{
public:
myFileWriter();
~myFileWriter(){}
void operator()(const string &) ;
};
///////////////////////// Implementation /////////////////////
////////////////////// Base class: myFileOp //////////////////////
/*******************************
* Constructor. I don’t say I critically depend on the correct mode’s setting…
* I *CRITICALLY* critically depend on the correct mode’s setting.
* >>>>>>>>>>> VERIFY!!!! <<<<<<<<<<<<<<<<<<<
*/
myFileOp::myFileOp(MODE m)
{
Mode = (m == READ || m == WRITE ? m : INVALID);
}
/*******************************
* File open functionality. Implemented as an overloaded functor operator.
* This is overridden and shadowed in the derived classes, but functionality
* is still available to them.
* Note, that I use a fstream object, declared globally.
* You know, that is not a good practice, right?
*/
void myFileOp::operator()(const string & fname )
{
if(Mode == INVALID)
return;
myFile.open(fname.c_str(), (Mode == READ ?
std::fstream::in : std::fstream::out));
}
//////////////////////////// end of myFileOp //////////////////////////////
//////////////////////////// myFileReader //////////////////////////////////
/*******************************
* Constructor. The only job is to set the mode correctly.
* This is done without client’s involvement
*/
myFileReader::myFileReader() : myFileOp(READ)
{
}
/*******************************
* Here is where all the job to read the file is done. Note, that I break out of that
* if number of reads exceeds certain value. Also outputs number of reads
*/
void myFileReader::operator()(const string & fname )
{
myFileOp::operator ()(fname); //Calling base’s func to open the file
string line;
int count = 0;
getline(myFile, line);
std::cout<<"\n1: From file reader:\n\t"<< …
getline(myFile, line);
std::cout<<"\n"<< …
{
std::cerr<<"\nFrom file reader: Data overflow; returning.";
return;
}
}
std::cout<<"\nDone reading file with counter: >>>"<< …
}
//////////////////////////// end of myFileReader //////////////////////////////
//////////////////////////// myFileWriter //////////////////////////////////
/*******************************
* Constructor. The only job is to set the mode correctly.
* This is done without client’s involvement
*/
myFileWriter::myFileWriter() : myFileOp(WRITE)
{
}
/*******************************
* Here is where all the job to write to the file is done.
* Note, that I write random strings to the file.
*/
void myFileWriter::operator()(const string & fname )
{
myFileOp::operator ()(fname); //Calling base’s func to open the file
for(int i = 0; i < 100; i++)
{
//A *very* useful tool. I recommend to take a look at it
//and to its many various useful functions.
stringstream sChars; //I just wanted to show the stringstream obj.
for(int j = 0; j < 50; j++)
{
sChars << char('a' + rand() % 26); //50 random chars
}
myFile<< …
}
}
int main()
{
myFileReader fr;
myFileWriter fw;
//spawning a reader thread, by passing a FileReader object in the
//form of functor (overloaded () operator), and file name as a param.
//technically, since I have a file name defined globally, I could
//just use it in the thread, just like I did the file handle,
//but I wanted to show how the parameters can be passed to the thread.
thread read_thr(fr, FILENAME);
//spawning a writer thread
thread write_thr(fw, FILENAME);
//You acquire resources, you release resources. In that case I
//wait for the thread to finish, and collect its status. If I don’t
//do that… welcome to the world of orphans and zombies…
//joining the threads is also a way to synchronize the threads execution.
read_thr.join();
write_thr.join();
cout<< …
}
OK, let’s run it. We will run it a few times, noting the number of reads from the file. I will also truncate most
of the 100 lines of file writings.
First run. Number of file reads is zero.
Note, BTW, how the output is interlaced. Very hard to make sense out of it.
Second run. Number of file reads is 1:
So we already see the problem. We have an inconsistent number of reads, and also… It seems that the file
reader is denied access to the file, but since we never bothered to check, we proceeded with the
execution of the program, assuming everything is nice and good. Do you think your clients would like
that software? Something tells me: “not really”.
Also, I’d like to bring to your attention that it was the file reader that was denied access, even though we
launched that thread before the file writer.
We need to fix that. We can fix that in a number of ways. Probably the correct one would be to
define a critical section, and lock the file reading/writing operations. In that case, if the reader detects that
the resource is locked, it enters the sleep mode, periodically waking up and checking for the availability
of the resource. When the resource becomes available, it would rush to lock it for itself, get hold of the file
and read it, not forgetting to unlock it afterwards. That would be the right chain of events for our MTA.
But. It’s just becoming too complex for the very introductory example. You already have 30+ pages of
the main lecture to read. Because of that, let’s do it the way we did it in our programs. We will
check for the successful file operation, and return (not entering the sleep-wait mode, but just return) if
the file is not available. I’ll show just the relevant parts of the code:
void myFileOp::operator()(const string & fname )
{
myFile.open(fname.c_str(), (Mode == READ ?
std::fstream::in : std::fstream::out));
if(!myFile)
Mode = INVALID;
}
void myFileReader::operator()(const string & fname )
{
myFileOp::operator ()(fname);
if(Mode == INVALID)
{
cerr<<"\nFile Reader: Cannot get file handle, returning.";
return;
}
//continue with the function
}
void myFileWriter::operator()(const string & fname )
{
myFileOp::operator ()(fname);
if(Mode == INVALID)
{
cerr<<"\nFile Writer: Cannot get file handle, returning.";
return;
}
//continue with the function
}
Here we test for the file in the base class, and if it is not available, we set the mode to INVALID.
Then in the derived classes (reader and writer) we check the mode’s status, and if it is INVALID,
just return from the function. Let’s see how it runs now.
Again, a few times. First time:
Error is triggered in the file reader.
Second time:
And this time we see that the error check triggered bailing out in the writer object, and we see the program
terminated properly.
As I used to say in my live classes:
– Any questions?
Do you see at least one other reason (beyond writing good, structured code, which is a pleasure to
look at) why you want to write code that follows the best practices?
Alright,
That’s it. And I see that I finished that at the bottom of the page. Again. And you say something about
coincidence?
© Vladimir Sverdlov, 2020
4. What you need to know before you start writing an MTA
1. Introduction
2. When to write a multithreading app?
5. The code.