04-16-2012, 04:39 PM
Hi there,
Over time, I have explained a few things about the ins and outs of the C/C++ language in various topics. Because this information may be valuable to those who want to understand the language better, I decided to compile it into a single topic, polish it a bit (rewriting some parts and adding new ones), and maintain it over time by adding more material. I packed the various parts into spoilers for your convenience; just expand the topic you're interested in. If you'd like me to explain something else, feel free to ask.
---------------------------------------------------------------------------------------------------------
Some basic questions - what languages are, what do you need for C/C++ and others and more - introduction
It will get translated into something like this (it's a simple assembly, JNE - Jump Not Equal and CMP - Compare):
Notice the CMP (compare) instructions, which work with numbers. They can't work with a string, because even the simplest kind of string is an array of arbitrary size in memory, and I can't recall any specific processor that could work with them natively.
So what you need to do is somehow convert the string to a number. This can be done in C/C++ as follows:
The Solution
You basically need to make a list of the strings you expect and an enumeration with an appropriately named value for each of them. Then define a function that translates the input string into the number corresponding to your enumeration. Enumeration names correspond to integer values, so they can be used with a switch.
This is a working C++ example, using standard C++. If you have any further questions, ask. Feel free to alter it to suit your needs and use it in any project.
---------------------------------------------------------------------------------------------------------
Fractional numbers in C/C++ - learn the difference between integer and floating point math
You won't obtain the correct result, because the expression uses integer literals, so integer math is performed. It's faster, but it doesn't support fractions: everything after the decimal point is "cut off" (it's never even calculated in the first place). To solve this, simply use double literals:
A literal is a fixed value that you type into your program, like a number or a string, and the compiler determines its data type from the way it's written. For it to be handled as a fractional number, you need to write it like one, even if the fractional part is zero. Remember that C/C++ won't do more calculation for you than necessary, so if you give it only integers, it will perform integer math with no fractions. Writing the literals with a decimal point basically forces it to use fractional numbers.
Also, double is kind of unnecessary here; float has enough precision for this, so don't waste space and performance. You can use float literals:
---------------------------------------------------------------------------------------------------------
Comparing strings in C/C++ and how null terminated strings are handled by the language and preventing program errors when loading strings
strcmp returns zero (false) when the strings are equal and a non-zero value (true) when they differ. The ! operator inverts that result, so for identical strings it becomes true, and for strings that differ it becomes false.
When you want to load user input into an array you made, this is a very bad way to do it:
Although it usually works, this is a very bad thing to do. If you made the array 20 elements long, it can store 19 characters plus the null character. So what if the user types 25? 30? More?
Remember that C++ does very few safety checks for you, so it won't even test whether you're accessing the array beyond its end. C++ is very close to machine code, so the cin object just gets a pointer to the location in memory where the array is stored. It doesn't get any information about the size, so it simply keeps writing data there for as long as there is input. If the user types 25 characters, it will write some of the data beyond your array and overwrite other data in memory, which can cause very weird behavior, crash your program and sometimes even crash a less stable OS (although you're unlikely to encounter that last case nowadays).
The proper way to do this is the following:
This stores at most 20 characters (19 regular characters plus the null character) in the memory area Array points to, so if the user types more, nothing is written past the end of the array and the program won't crash.
---------------------------------------------------------------------------------------------------------
Writing better code - Name your variables/classes/functions properly
Make it
---------------------------------------------------------------------------------------------------------
Strings in C versus C++
However, this IS a reason to use C strings. STL strings add overhead and various runtime checks, they typically allocate their contents dynamically from the heap (some implementations keep very short strings inside the object itself) and they have a larger memory footprint. Using them when they're not needed, especially in C++, which is often chosen as a high performance language, is quite a dirty and bad habit. Plus, like I said, these strings are provided as a library. Although they are part of the standard, they are not part of the language itself, whereas null terminated strings are.
Additionally, char arrays, the same storage that null terminated strings use, are very often used as buffers for arbitrary binary data, because the char type corresponds to a single byte.
---------------------------------------------------------------------------------------------------------
How cout and iostream work - brief introduction to operator overloading
Pre-processor directives aren't handled by the C/C++ compiler proper, so when you write an #include <iostream> directive, the C++ compiler doesn't really even see this part. Instead, the pre-processor takes the contents of the file iostream, puts them in place of the directive and then passes the result to the compiler. The file iostream contains various functions and declarations already made for you to use, including cout.
cout isn't a print statement; it's an object from the standard library that's tied to the stdout stream, which by default is connected to the console output. That's because C++ is an object oriented language. When you send any data to cout, it passes them to the stdout stream (you can imagine it a bit like a pipe through which the data travel), which normally delivers them to the console, where the user can see them.
The operator << isn't a shift operator in this case (it doesn't really shift anything); it's an overloaded operator for the ostream class, which is the type of the cout object. C++ allows you to overload operators for classes, that is, to change their meaning for variables that hold objects of certain classes. It's basically a nicer way to call a function, so if you write:
Then what happens behind the scenes is that it basically calls this method of the cout object:
This handles the printing and returns cout, so it can be used in chained statements and expressions.
It calls the overloaded operator method of the cout object and passes it the right operand. However, the overloaded operator may not be defined as a member of the class itself but as a separate function (often a friend); in that case cout is passed to the function as an explicit argument as well. Otherwise it's passed implicitly and is accessible via the this pointer.
It's almost the same as calling printf("Hello World"); in C, except that it's fancier and object oriented (it's called in relation to a specific object). Think of cout << "Hello World"; as a fancier way of calling a function, like cout.printf("Hello World"); It is also very beneficial, because you can substitute other objects for cout, for example a file, so instead of being written to the screen the text is written to the file. You send the data in exactly the same way, instead of having a million and one ways to do the same thing.
Also, endl is an I/O stream manipulator: if you pass such an object to cout (or another stream), it changes how the input or output stream behaves. endl itself writes a newline and flushes the output. For example:
This prints the number in hexadecimal. hex is another I/O stream manipulator, just like endl.
---------------------------------------------------------------------------------------------------------
Writing a keylogger like a real programmer (requires thinking!)
If this doesn't work with some compilers, add double underscores before the asm keyword:
---------------------------------------------------------------------------------------------------------
Beginner mistakes broken apart (includes sources of corrections)
Recursive way
Dynamic way
The results of the calculation itself should always be exactly the same (it will calculate exactly the same series of numbers no matter which version you use); however, there are important differences in the side effects.
Iterative calculation is often the best, but also the most difficult one to implement, especially if you're dealing with something more complex. It uses the least memory and is usually the fastest. That's because you have just one copy of the variables and it keeps changing over time: with each iteration of the loop the following element is calculated and the old values are overwritten.
Recursive calculation is often the easiest to implement, as it often directly mirrors the mathematical way something is described. That is the case for the factorial (you can see the relation to the code; it's written basically the same way, as the product of the number N you're calculating and the factorial of N - 1).
However, recursion is somewhat slower, and if you want to calculate a lot of values, your program might eventually crash. That's because a recursive calculation always calculates one element, and in order to do so it usually calls itself again with a different input parameter. The newly called function needs to calculate another element and again calls itself with a different parameter. It keeps on calling itself until some condition is satisfied and the function at the end returns a number; then the results start cascading back until you get the final result.
The problem is that for each function call, a copy of the variables it needs (including the input parameters) is made, along with information about where the function should return to. This information needs to be stored somewhere, usually on an internally created stack (a memory construct) of fixed size, so if the recursive calls go too deep, the stack won't be able to hold all these copies and your program will crash because of a stack overflow.
Additionally, copying the variables, calling the function for each element, returning and so on cause overhead that slows the whole algorithm down.
That being said, recursive functions are still very useful. Some calculations are really difficult to implement in an iterative way, so recursive calls might be the only practical way to go.
Now, the dynamic way uses a dynamic memory construct such as the vector from the STL. Such constructs store their values in a dynamically allocated memory area called the heap. Dynamic means that you can allocate and reallocate memory in a rather arbitrary manner while the program is running, as opposed to static allocation, which is used for the variables and arrays you declare in the usual way. The size of those is known at compile time and they're just static: they can't move and they can't change in size.
Dynamic allocation allows you, for example, to change the size of a dynamic array (which is called a vector) on the fly. If the array is too small to hold a new element, it is simply resized: more memory is allocated automatically for you, assuming the system has some free memory.
The dynamic calculation presented in that example is similar to the iterative way, except that it stores all the calculated values in the dynamic array. It doesn't only calculate the n-th value in the series; instead, it creates a dynamic array that contains the whole series up to the n-th value. For this, you obviously need much more memory (storing 100 values, for example, as opposed to just one plus a few work variables), but the heap is usually much, much bigger than the internal stack, so you would usually run out of memory only if you were to calculate, I dunno... at least a few hundred million values.
The dynamic way would probably be used if you need to have the whole series stored somewhere. You can think of it as the iterative way that simply stores each calculated element instead of discarding it (by overwriting it with the following element). In the example above, the iterative and recursive functions are each called once for every number, so the calculation is repeated twenty times, which is a bit wasteful. The dynamic calculation, however, is called only once; it stores all the results from one to twenty and then just prints them. It could be said that for calculating a whole series, not just one final result, the dynamic way is the fastest one.