CYFA - Creating Your First Assembler - Getting Help

#1
Ok, so in the previous part we built structures to hold and encode our instructions, but we still have a little left to do with them. In this part, we're going to take care of all the helpers we need to wrap up our assembler's encoding mechanism. I've listed them below:
  • initializers for the individual structures
  • hidden functions to detect endianness and perform conversion if necessary
  • function to convert our structure to a 32-bit integer
  • enumeration for condition codes

Alright, this is a pretty big list, so rather than walking you through it step by step, I'll explain the first step and then just complete the rest for you. You're welcome (and encouraged) to try this for yourself. I'm going to break this down into sections based on the above list. Enjoy, and don't forget to discuss this at the end.



1. Initializers
Ok, so for this one, we're going to initialize each structure to a default state depending on the instruction we choose. Some instructions have hardcoded values; others we just want to init to a valid state so we can reduce our processing code. Let's start off by making x_init functions for each structure and making an enum to hold each type. Your new instruction.h file should look like this:

[Hidden content: visible only to registered forum members]
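Since the original header is not visible here, the following is only a rough sketch of what such a header could contain. The names instruction_type, INSTRUCTION_DATA, INSTRUCTION_TRANSFER, INSTRUCTION_BRANCH, INSTRUCTION_UNDEF, and the int return type of instruction_init are assumptions made for illustration, not the author's actual identifiers.

```c
#include <stdint.h>

/* Hypothetical type tags, one per instruction structure.
 * No explicit values are given: the tags are never OR-ed together,
 * so the compiler's default numbering (0, 1, 2, ...) is enough. */
typedef enum {
    INSTRUCTION_DATA,      /* data processing instructions      */
    INSTRUCTION_TRANSFER,  /* single data transfer instructions */
    INSTRUCTION_BRANCH,    /* branch instructions               */
    INSTRUCTION_UNDEF      /* unknown type, kept for error handling */
} instruction_type;

/* Forward declaration standing in for the encoding union
 * built in the previous part. */
union instruction;

/* Main initializer (assumed signature): puts the encoding union into
 * a valid default state for the requested type.
 * Assumed to return 0 on success and -1 for INSTRUCTION_UNDEF. */
int instruction_init(union instruction *ins, instruction_type type);
```

Because UNDEF comes last, it also works as a count of the valid types, which can be handy for bounds checks later on.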

I hope that's pretty clear as to how it works. We don't need to set enum values because these won't be getting combined, and we add an UNDEF enum so that we can process errors as time goes on. Now, we can go ahead and create .c files for each of the instruction headers. The contents of these C files are shown below, as well as a project screenshot to show their organization:
[Image: fJrsrYj.png]
instruction.c

[Hidden content: visible only to registered forum members]

instruction/data.c

[Hidden content: visible only to registered forum members]

instruction/transfer.c

[Hidden content: visible only to registered forum members]

instruction/branch.c

[Hidden content: visible only to registered forum members]

Ok, now we can go ahead and finish the main initializer function. This one takes in 2 arguments and will initialize the encoding union with the correct values. Let's go ahead and write that code using a basic switch. The instruction.c file should now look like this

[Hidden content: visible only to registered forum members]
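As a rough illustration of the switch-based initializer described above, here is a minimal sketch. Everything is collapsed into one file so it compiles on its own; in the layout described in this post, the three instruction_init_x functions would live in instruction/data.c, instruction/transfer.c, and instruction/branch.c. The union layout, the type names, the int return value, and the choice of "always" (1110) as the default condition are assumptions for the example.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    INSTRUCTION_DATA, INSTRUCTION_TRANSFER, INSTRUCTION_BRANCH, INSTRUCTION_UNDEF
} instruction_type;

/* Simplified stand-in for the encoding union. Fields follow the classic
 * 32-bit ARM encodings and are listed from bit 0 upward, which assumes
 * the compiler allocates bit-fields LSB first. */
typedef union {
    struct { unsigned int operand2:12, rd:4, rn:4, s:1, opcode:4, i:1,
                          fixed:2, cond:4; } data;
    struct { unsigned int offset:12, rd:4, rn:4, l:1, w:1, b:1, u:1, p:1,
                          i:1, fixed:2, cond:4; } transfer;
    struct { unsigned int offset:24, l:1, fixed:3, cond:4; } branch;
} instruction;

/* Per-structure initializers: zero everything, then set the hardcoded
 * bits and a default condition of "always" (1110). */
void instruction_init_data(instruction *ins)
{
    memset(ins, 0, sizeof *ins);
    ins->data.fixed = 0x0;      /* bits 27-26 = 00 */
    ins->data.cond  = 0xE;
}

void instruction_init_transfer(instruction *ins)
{
    memset(ins, 0, sizeof *ins);
    ins->transfer.fixed = 0x1;  /* bits 27-26 = 01 */
    ins->transfer.cond  = 0xE;
}

void instruction_init_branch(instruction *ins)
{
    memset(ins, 0, sizeof *ins);
    ins->branch.fixed = 0x5;    /* bits 27-25 = 101 */
    ins->branch.cond  = 0xE;
}

/* Main initializer: dispatches on the type tag with a basic switch. */
int instruction_init(instruction *ins, instruction_type type)
{
    assert(ins != NULL);
    switch (type) {
    case INSTRUCTION_DATA:     instruction_init_data(ins);     return 0;
    case INSTRUCTION_TRANSFER: instruction_init_transfer(ins); return 0;
    case INSTRUCTION_BRANCH:   instruction_init_branch(ins);   return 0;
    default:                   return -1;  /* INSTRUCTION_UNDEF or garbage */
    }
}
```

A caller would declare an instruction, pass its address along with a type tag, and check the return value before filling in registers or offsets.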


Now, before we move on, I want to point out why I didn't put the instruction_init_x functions in their respective header files. This comes down to scoping: these functions should only be called by the primary init, so there's no need for them to be in a header file. By declaring them here in instruction.c, we keep their signatures out of every header, so other code never sees them and they won't clutter up the namespace of future code. This is how we handled namespaces before C++ came out: if you don't need it, don't include it.



2. Hidden functions to detect endianness and perform conversion if necessary
Ok, so if we've filled in this structure and then realize we're on a big endian system (they exist), then we have a problem, because by default ARM is little endian. We need to make a function that detects that and compensates accordingly. The good thing is, all we have to do is swap the byte order, so we'll be using the network functions htonl and ntohl. We'll do it pretty simply. There are compiler macros to detect this, but we won't be using them, just for the sake of cross compatibility. Let's go ahead and add this signature and a bare function to our instruction.c file.
Now, the includes for this are a little weird: we have to use 2 different files, arpa/inet.h for Linux, BSD, and OS X, and Winsock2.h for Windows systems. We'll be using a header switch for that. Our new instruction.c file should look like this:

[Hidden content: visible only to registered forum members]
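To make the header switch and the detection idea concrete, here is a minimal sketch, not the hidden original. The function names host_is_big_endian, swap_bytes32, and to_little_endian are made up for this example, and the detection runs htonl on the constant 1 rather than on the input value. The byte swap is written by hand because htonl does nothing on a big endian host.

```c
#include <stdint.h>

#ifdef _WIN32
#include <winsock2.h>    /* htonl on Windows (link against ws2_32) */
#else
#include <arpa/inet.h>   /* htonl on Linux, BSD, and OS X */
#endif

/* Private helpers: static keeps them local to this file. */

/* htonl converts to network (big endian) order. On a big endian host
 * that is a no-op, so the value 1 comes back unchanged. */
static int host_is_big_endian(void)
{
    return htonl(1u) == 1u;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Return v arranged so that its bytes in memory are little endian,
 * which is the order ARM expects by default. */
static uint32_t to_little_endian(uint32_t v)
{
    return host_is_big_endian() ? swap_bytes32(v) : v;
}
```

Whatever the host, the first byte in memory of to_little_endian(0x11223344) should be 0x44, which is the property the encoder relies on when it writes the word out.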

Now, the standard questions: why did you put the function signature inside the .c file?
Well, that's because this is a hidden function. We could even go so far as to declare it static, which is what actually stops the symbol from being exported, but we don't need to. The point is that this function isn't intended to be called outside of instruction.c, so we keep its declaration out of the header. This is how we keep our namespace clean.
Second question: what do htonl and ntohl do?
These function names are actually acronyms for Host To Network Long and Network To Host Long. Let's define some of those terms for you. A long (in the original C networking spec) is a 32-bit unsigned integer (uint32_t), and the network order is big endian. What we're doing with this function is looking for change. If we convert our input to the network order (big endian), and it doesn't change, then we know it was already a big endian value and we need to convert it to little endian. If it does change, then we know it was already little endian and we don't need any conversion.
Third question: why won't it build?
If you try to build this library, you should get something similar to the following:
[Image: Fp7E5QC.png]
Relax, the code is correct. This actually comes down to why we did it this way. We're using a shared library to handle the conversion. This makes our code more platform independent. If we had used macros, then the code would only work for the exact system it was compiled on, but if we use a shared library, it will work on any system that has that library. This means that every system could have a different version of the library, and would always correlate to the endianness of the system it's being run on. We need to link that library. Go ahead and go to the build options and select as.
Go into Linker settings
[Image: yhjh09U.png]
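If you're using Windows, you will want to add ws2_32 to the link libraries for this project. On Linux, htonl and ntohl are normally part of the C library, so no extra library should be needed; -lsocket is only required on systems that keep the socket functions in a separate library, such as Solaris.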
[Image: X2WYBJx.png]
Go ahead and exit that dialog and save the project.
[Image: u4xdPSd.png]
Congrats! We're done with section 2 of this installment.



3. Function to convert our structure to a 32-bit integer
Ok, so once we've populated our instruction union, we need something that converts it into a form that's easy to write out. We know that all instructions are 32 bits long, so we can stick with our trusty uint32_t. Unfortunately, we don't have a very good conversion for this, so we'll have to make a hidden union for it. Our conversion function also has to take endianness into account, so it's a good thing we wrote that function a little while ago. Let's start with the hidden union.
We're going to define this in instruction.c, and its task is going to be converting our original union into type uint32_t. This union should have two members, then: our source union and our destination integer.

[Hidden content: visible only to registered forum members]

Ok, that was easy, let's go ahead and make the signature for this function. This is a function that's intended to be called by our main routine, so we'll want to export this to a higher namespace by placing the signature in instruction.h. The signature for this will be the following

[Hidden content: visible only to registered forum members]

Why did we use a pointer, you ask? We did that because this function is intended to be called outside of our scope or control. We don't want to make guarantees to the main program that we won't modify this, and we want to follow the common C convention of passing abstract data types by reference (that is, through a pointer).
Let's go back to instruction.c and write this function.

[Hidden content: visible only to registered forum members]
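The hidden union and the conversion function could be sketched roughly as follows. This is an illustration under stated assumptions, not the original code: the names instruction_converter and instruction_to_uint32 are hypothetical, the instruction union is trimmed down to the branch layout only, and the bit-fields are assumed to be allocated from bit 0 upward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <winsock2.h>
#else
#include <arpa/inet.h>
#endif

/* Trimmed stand-in for the encoding union: branch layout only,
 * fields listed from bit 0 upward (assumes LSB-first allocation). */
typedef union {
    struct { unsigned int offset:24, l:1, fixed:3, cond:4; } branch;
} instruction;

/* Hidden union: the source union and the destination integer
 * share the same four bytes. */
typedef union {
    instruction ins;
    uint32_t    raw;
} instruction_converter;

/* Endianness helper in the spirit of section 2. */
static uint32_t to_little_endian(uint32_t v)
{
    if (htonl(1u) != 1u)
        return v;   /* host is already little endian */
    return (v << 24) | ((v & 0xFF00u) << 8) | ((v >> 8) & 0xFF00u) | (v >> 24);
}

/* Exported conversion (assumed name): copy the structure into the
 * hidden union, read it back as an integer, then fix the byte order. */
uint32_t instruction_to_uint32(instruction *ins)
{
    instruction_converter conv;

    assert(ins != NULL);
    conv.raw = 0;
    conv.ins = *ins;
    return to_little_endian(conv.raw);
}
```

Under those assumptions, a branch with condition 1110, the fixed bits 101, and an offset of 2 comes out as 0xEA000002 on a little endian host.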

This is a pretty simple function, but it actually does a lot of work. It takes our input instruction structure (with all of its flags), converts it to a uint32_t (to make it easier to write out), and rearranges it to fit the proper endianness for ARM execution.
This part was pretty small, because we laid all of the groundwork for it in the previous sections and parts. This is the very reason it's so important to plan your projects out before you start them. Since we knew exactly how the pieces were supposed to fall together, we were able to design it down to the line to make the whole program much shorter.



4. Enumeration for condition codes
Now, this one will be the easiest, but it will also be the most tedious. In instruction.h, we're going to make an enumeration that contains all of the instruction condition codes so that it will be easier to use them both in parsing and in the instruction structure. You can find them in the following table.
[Image: iPjcqlY.png]
I'll write the code and put it below, but at least read through the table.

[Hidden content: visible only to registered forum members]
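For readers without access to the hidden block or the table image, here is a sketch of what such an enumeration could look like. The kCONDITION prefix comes from this post; the exact spelling after the prefix, the type name condition_code, and the helper apply_condition are assumptions. The order follows the standard ARM condition field values 0000 through 1111, so no explicit numbering is needed.

```c
#include <stdint.h>

/* ARM condition codes in encoding order, so the default enum
 * numbering (0..15) matches the 4-bit condition field. */
typedef enum {
    kCONDITION_EQ,  /* 0000 equal                         */
    kCONDITION_NE,  /* 0001 not equal                     */
    kCONDITION_CS,  /* 0010 carry set / unsigned >=       */
    kCONDITION_CC,  /* 0011 carry clear / unsigned <      */
    kCONDITION_MI,  /* 0100 negative                      */
    kCONDITION_PL,  /* 0101 positive or zero              */
    kCONDITION_VS,  /* 0110 overflow                      */
    kCONDITION_VC,  /* 0111 no overflow                   */
    kCONDITION_HI,  /* 1000 unsigned higher               */
    kCONDITION_LS,  /* 1001 unsigned lower or same        */
    kCONDITION_GE,  /* 1010 signed >=                     */
    kCONDITION_LT,  /* 1011 signed <                      */
    kCONDITION_GT,  /* 1100 signed >                      */
    kCONDITION_LE,  /* 1101 signed <=                     */
    kCONDITION_AL,  /* 1110 always                        */
    kCONDITION_NV   /* 1111 reserved                      */
} condition_code;

/* Hypothetical helper: place a condition code into bits 31-28
 * of an already encoded instruction word. */
static uint32_t apply_condition(uint32_t word, condition_code cond)
{
    return (word & 0x0FFFFFFFu) | ((uint32_t)cond << 28);
}
```

Listing every code, including the ones the assembler will rarely use, is what lets the values line up with the encoding without writing them out by hand.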

Congrats! We've completed part 7!
Now, I included all of the condition codes, even though we aren't going to worry about many of them. I did this so that I would not have to number the enum by hand. Another question you may ask is why I prefixed them all with kCONDITION. I did this because enumeration constants in C share one namespace with every other ordinary identifier, so we need a unique prefix for them. It's a common convention to start that prefix with a lowercase k. Interestingly enough, this actually started at Apple.



In the next part, we will start to look at parsing. I'm hoping this series will be no longer than 10 parts, so the last couple will likely be long reads. At the end of the series, I will make a post (in this section) containing the full source of this project for your reference.

After careful deliberation, I've decided to release this today rather than next week. Merry Christmas.

EDIT: In order to help generate activity both in the programming section and on these threads, I will add a giveaway to this. The drawing will happen once 15 replies have been posted, and the prize will either be 25NSP or $10 PayPal. The drawing will be at random (due to rule violation concerns).
Rules:
1. Your post must contribute in some way either to the programming section as a whole or this thread/series
(this means that short posts similar to "count me in" will not be counted. The post does not have to be long, but needs to be something of value. The idea is to help SL grow its programmer base)
2. Your posts must obey all SL rules

#2
I'm going to bump this. The contest has been OK'd by staff, so let's hear what you have to say. Every qualifying post you make increases your chances of winning.

#3
I really hate to do this, but I'm bumping this once more, since more people are active due to the Christmas giveaway.

#4
"Relax, the code is correct. This actually comes down to why we did it this way. We're using a shared library to handle the conversion. This makes our code more platform independent. If we had used macros, then the code would only work for the exact system it was compiled on, but if we use a shared library, it will work on any system that has that library. This means that every system could have a different version of the library, and would always correlate to the endianness of the system it's being run on. We need to link that library. Go ahead and go to the build options and select as."

This is the most smooth, relaxing paragraph I have ever seen written about C code. I have no idea why it has that effect, it sends chills down my spine.

That was a lot of code, looking forward to the next part!

#5
Pretty much straightforward.

#6
Quote:(12-01-2017, 10:59 PM)Ender Wrote:

"Relax, the code is correct. This actually comes down to why we did it this way. We're using a shared library to handle the conversion. This makes our code more platform independent. If we had used macros, then the code would only work for the exact system it was compiled on, but if we use a shared library, it will work on any system that has that library. This means that every system could have a different version of the library, and would always correlate to the endianness of the system it's being run on. We need to link that library. Go ahead and go to the build options and select as."

This is the most smooth, relaxing paragraph I have ever seen written about C code. I have no idea why it has that effect, it sends chills down my spine.

That was a lot of code, looking forward to the next part!

I had to write it like that because I was afraid of trolls running around telling me I could've just done

[Hidden content: visible only to registered forum members]

Which actually wouldn't work for our case. Using shared libraries avoids this issue, since the code is dependent on the host, not our program. The only thing worse than non-optimal code is trolls who think your code is non-optimal.