Synchronization of I/O

The first thing to do when tackling a problem from this kind of C++ judging website is to desynchronize C++ I/O. Compatibility between C and C++ is quite high, and for that reason, together with historical ones, C++ is configured by default with the C and C++ I/O streams synchronized.

What does this imply? Basically, that reading a value with cin is several times slower than reading it with scanf. In theory it should not be like this. After all:

With cin, a function that already knows what to read (an integer, a char, a string...) is invoked directly.
With scanf, a format string has to be interpreted to work out what to read at each moment, and only then is the requested value read.

Forcing both mechanisms to stay synchronized works against cin, because it has to pay a toll that increases execution time enormously for the kind of work these programs do.

The best thing is to desynchronize the I/O and avoid paying unnecessary tolls:

std::ios::sync_with_stdio(false);
After this call it is best not to mix C-style stdin calls (scanf, for example) with cin, since the result will not be predictable. In return, calls to cin will be faster than those to scanf.

Organize the data

If you look closely, the point of this problem is that it performs its operations by columns. The ideal, then, would be to build the matrix by columns instead of by rows.

Why?

You are defining an n*n array on the program's stack:

int arr[n][n];
This structure is known as a VLA, or Variable Length Array. It is not a feature supported by the C++ standard and it is best avoided. However, this type of program is not after standard conformance but after doing its calculations in the shortest possible time. That is, if the site's compiler accepts it and the stack does not overflow, perfect.

The point is that, in an array of this type, the data is laid out in memory as follows:

| row 0 | row 1 | row 2 | ... | row n-1 |
and I propose swapping rows for columns:

| column 0 | column 1 | column 2 | ... | column n-1 |
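To make the two layouts concrete, here is a minimal sketch of how a (row, column) pair maps to a position in a flat n*n buffer under each one. The function names rowMajorIndex and columnMajorIndex are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Position of (row, col) when whole rows are stored one after another.
// Moving down one row in the same column jumps n elements.
std::size_t rowMajorIndex(std::size_t row, std::size_t col, std::size_t n) {
    assert(row < n && col < n);
    return row * n + col;
}

// Position of (row, col) when whole columns are stored one after another.
// Moving down one row in the same column moves to the adjacent element.
std::size_t columnMajorIndex(std::size_t row, std::size_t col, std::size_t n) {
    assert(row < n && col < n);
    return col * n + row;
}
```

With the column layout, consecutive cells of one column sit next to each other, which is what matters for a program that works column by column.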
The reason has to do with how a computer works internally.

Computers have several types of memory: disks, RAM, ... As a rule, the more capacity a memory has, the slower it is to access, because the fastest memories are also the most expensive.

There is a particularly fast memory that usually sits right next to the processor: the cache. This memory, with a capacity of a few megabytes, is the one that feeds the processor all the data it needs. It is divided into pages (cache lines, strictly speaking), so loading the data you need into one of them probably means evicting other data. I will call this exchange paging; in terms of time it is usually quite costly, so it should be avoided as far as possible.

To minimize page exchanges within your application, you have to try to keep data that is used together close together in memory. The further apart the data is, the more likely it is that the program will need two or more different cache pages just to perform one operation.

Since the program only operates at the column level, the logical thing is to organize the data by columns. That way, the rows belonging to a column sit in contiguous memory positions. We have solved the paging problem... or have we?

Now the matrix has to be filled and, of course, if you keep filling it the way you do now, the paging problem comes back, because you will be jumping from column to column. You have to traverse the matrix sequentially, that is, iterate over the columns first and then over the rows. Filling it is now a little more complicated, but only a little. A proposal:

int* ptr = reinterpret_cast<int*>(arr);
for(int col=0; col<n; col++)
{
    // First value of this column (row 0)
    int sig = col+1;
    for(int *end = ptr+n; ptr<end; ptr++)
    {
        *ptr = sig;
        // The next row of the same column holds a value n larger
        sig += n;
    }
}
It is no longer enough to simply increment a value; we must compute the initial value for each column. For each row, it is enough to add n to the previous value.

And with this, now yes, we have solved the paging problem. You should now notice the program running quite a bit faster.

Avoid using string

While using std::string instead of char* is generally recommended, in this type of program we can afford to ignore that recommendation.

In your case it is even easier, since you do not even need char*. Your program only receives two commands, P and R, and each of them fits in a char. Comparing two char values is instantaneous, while comparing two character strings implies at least a loop:

char op;
std::cin >> op;
if( op == 'R' )
{
    // ...
}
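As an illustration of dispatching on a single char, here is a minimal sketch of a command loop. It assumes, purely for the example, that each input line starts with the command letter and that anything after it on the line is arguments, which the sketch simply skips. OpCounts, countOps and countOpsInText are hypothetical names, and the real handling of P and R would replace the counters.

```cpp
#include <cstddef>
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical tally of the commands seen in the input.
struct OpCounts {
    std::size_t p = 0;
    std::size_t r = 0;
    std::size_t other = 0;
};

// Reads one command letter per line and classifies it with plain char
// comparisons. The rest of each line is discarded (assumption: it only
// holds the arguments of the command).
OpCounts countOps(std::istream& in) {
    OpCounts counts;
    char op;
    while (in >> op) {
        if (op == 'P') {
            ++counts.p;
        } else if (op == 'R') {
            ++counts.r;
        } else {
            ++counts.other;
        }
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return counts;
}

// Convenience wrapper for trying the loop on an in-memory string.
OpCounts countOpsInText(const std::string& text) {
    std::istringstream in(text);
    return countOps(in);
}
```

In an actual submission, main would call std::ios::sync_with_stdio(false) once at the start and then pass std::cin to countOps.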
Other improvements

Try to avoid redundant operations. If you need to run the same operation several times, try saving intermediate results so you do not have to recompute them.

Choose the compiler carefully. Each compiler has its own characteristics and the resulting binary may vary quite a bit. For C++ I tend to prefer clang over g++, and preferably the latest version available, in this case the one that supports C++17. You will have to test which one gives you the best results.

Think about different approaches. Sometimes (it does not have to be the case here) a problem can be incredibly complex if tackled from the obvious angle, and it is necessary to turn it over a couple of times and consider different solutions.

In this case, for example, you might choose not to initialize the entire matrix. It is very unlikely that the test cases touch absolutely every column. As long as no swap operation touches it, a column can remain uninitialized, which will save you some lovely clock cycles. That is, the initialization of the matrix could be limited to something like this:

int *data = new int[size*size];
int* ptr = data;
for( int i=0; i<size; i++, ptr += size)
{
    // Mark the first cell of each column as "not initialized"
    *ptr = 0;
}
This way we can tell whether a column is initialized simply by checking its first value. If it is initialized we just retrieve the requested value; if it is not, we compute the default value:

int* ptrColumn = data + (column * size);
int cellValue = (*ptrColumn == 0)? (row * size + column + 1) : *(ptrColumn + row);
//                                 ~~~~~~~~~~~~~~~~~~~~~~~~~   ~~~~~~~~~~~~~~~~~~
//                                 Default value               Actual value
Of course, when we have to perform a swap on an uninitialized column, we will first have to give values to the entire column.

For a small matrix the difference will not be noticeable (and you might even pay some penalty), but for large matrices with few operations the improvement can be enormous.

As a test, I tried the exercise myself and achieved a time of 0.22 applying the changes I have described. It is easy to see that that result bears today's date. It could probably be optimized further, but as a practical case I think it is enough.
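For reference, the lazy initialization idea described above could be sketched roughly like this. It is only an illustration under assumptions: the class LazyColumnMatrix and its members are hypothetical names, and the swap is assumed to exchange two whole columns, which may not match the exact operation in the exercise.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical n x n matrix stored by columns. A cell defaults to
// row * n + col + 1, and a column is only written when a swap needs it.
class LazyColumnMatrix {
public:
    explicit LazyColumnMatrix(std::size_t n) : n_(n), data_(new int[n * n]) {
        // Only the first cell of each column is written: 0 means "not filled".
        // Real values start at 1, so 0 never collides with stored data.
        for (std::size_t col = 0; col < n_; ++col) {
            data_[col * n_] = 0;
        }
    }

    int get(std::size_t row, std::size_t col) const {
        assert(row < n_ && col < n_);
        const int* column = data_.get() + col * n_;
        return (*column == 0) ? defaultValue(row, col) : column[row];
    }

    // Assumed operation: exchange two whole columns.
    void swapColumns(std::size_t a, std::size_t b) {
        assert(a < n_ && b < n_);
        if (a == b) return;
        fillColumn(a);  // both columns need real values before swapping
        fillColumn(b);
        int* first = data_.get() + a * n_;
        std::swap_ranges(first, first + n_, data_.get() + b * n_);
    }

private:
    int defaultValue(std::size_t row, std::size_t col) const {
        return static_cast<int>(row * n_ + col + 1);
    }

    void fillColumn(std::size_t col) {
        int* column = data_.get() + col * n_;
        if (*column != 0) return;  // already holds real values
        for (std::size_t row = 0; row < n_; ++row) {
            column[row] = defaultValue(row, col);
        }
    }

    std::size_t n_;
    std::unique_ptr<int[]> data_;  // deliberately left uninitialized
};
```

Because each column is contiguous, the swap walks two sequential ranges, and columns that are never swapped cost a single write at construction time.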