Linux nasm hello world

NASM Hello World on Linux: undefined reference to `main’

I’ve been following this tutorial for an intro to assembly on Linux.

I’ve then had problems compiling it. I’ve looked around and found (on SO) that I should compile it like this:

But I keep getting this error from GCC:

(NB: I’m running Debian Linux on a 64 bit Intel i7)

2 Answers

You should add -nostdlib when linking your binary.

If you are going to learn assembly, you are much better served learning to use the assembler nasm and the linker ld directly, without relying on gcc. There is nothing wrong with using gcc, but it hides part of the linking process that you need to understand going forward.

When learning assembly in the current environment (generally building on x86_64 while following examples written in 32-bit x86 assembler), you must learn to build for the proper target and understand the language (syscall) differences between the two. Your code example is 32-bit assembler. As such, your nasm compile string is incorrect:

The -f elf64 option produces a 64-bit object file, but the instructions in your code are 32-bit instructions, so it won’t work.
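To make the 32-bit versus 64-bit differences concrete, here is a minimal sketch of a single NASM source that assembles either way. It relies on NASM's standard `__OUTPUT_FORMAT__` macro to select the matching system-call convention. The file name `hello_dual.asm`, the labels `msg` and `len`, and the build commands in the comments are illustrative assumptions, not taken from the question.

```nasm
; hello_dual.asm (hypothetical file name): one source, two targets.
; Assumed build commands:
;   32-bit: nasm -f elf   hello_dual.asm && ld -m elf_i386 -o hello32 hello_dual.o
;   64-bit: nasm -f elf64 hello_dual.asm && ld -o hello64 hello_dual.o

section .data
    msg     db  "Hello, world!", 0x0a   ; the string plus a newline byte
    len     equ $ - msg                 ; assemble-time length of msg

section .text
    global _start

_start:
%ifidn __OUTPUT_FORMAT__, elf64
    ; x86-64 convention: number in rax, arguments in rdi, rsi, rdx, then 'syscall'
    mov     eax, 1              ; write is system call 1 (writing eax zero-extends into rax)
    mov     edi, 1              ; file descriptor 1 = stdout
    lea     rsi, [rel msg]      ; address of the buffer
    mov     edx, len            ; number of bytes
    syscall
    mov     eax, 60             ; exit is system call 60
    xor     edi, edi            ; exit status 0
    syscall
%else
    ; i386 convention: number in eax, arguments in ebx, ecx, edx, then 'int 0x80'
    mov     eax, 4              ; write is system call 4
    mov     ebx, 1              ; file descriptor 1 = stdout
    mov     ecx, msg            ; address of the buffer
    mov     edx, len            ; number of bytes
    int     0x80
    mov     eax, 1              ; exit is system call 1
    xor     ebx, ebx            ; exit status 0
    int     0x80
%endif
```

The point of the sketch is that the object format, the register names, the system-call numbers and the instruction that enters the kernel all have to change together; mixing a 64-bit object format with the 32-bit convention is what goes wrong in the question.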

Understanding and using ld gives you a better grasp of the differences. Rather than using gcc, you can use nasm and ld to accomplish the same thing. For example (with slight modification to the code):

You compile and build with:

Note the use of -f elf for 32-bit code in the nasm call and the -m elf_i386 linker option to create a compatible executable.

output:

If you are serious about learning assembler, there are a number of good references on the web. One of the best is The Art of Assembly. (It is written primarily for 8086 and x86, but the foundation it provides is invaluable.) In addition, looking at the executables you create in binary can be helpful. Take a look at Binary Vi (BVI). It is a good tool.


A simplistic Hello World for nasm w/ intel syntax

The makefile assumes you’ve got nasm and ld installed and uses the ‘-m elf_i386’ linker flag, which suits my own PC; you’ll probably have to adjust this.

What’s this crazy code do? (Breakdown)

This program is more or less the equivalent of the following Python:

Let’s go line by line, since this is for total beginners.

The section ‘.data’ directive tells the assembler that what follows is initialized data, similar to variables.

‘msg’ acts like a variable name, but it is really just a label the programmer uses to mark the memory offset where the data is stored. ‘msgLen’ is a named constant.

db defines bytes. With ASCII, each character is stored as 1 byte, so here we list the bytes we want to store, in order: in this case "Hello, World!" followed by 0x0a, the newline character.

equ says that we want to evaluate the expression ‘$-msg’, which takes the current address (represented by the ‘$’ character) and subtracts the address labeled ‘msg’. Since msgLen is defined directly after msg, ‘$’ points just past the last byte of the string, so the difference is the length of (the number of bytes in) the array we stored as ‘msg’. Unlike msg, msgLen is an assemble-time constant and takes up no memory itself.
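One way to check what equ computes is to hand the constant to the kernel as the exit status. The following is a small sketch under assumptions: the file name `msglen.asm` and the build commands are hypothetical, and it uses the 32-bit int 0x80 interface discussed in the rest of this README.

```nasm
; msglen.asm (hypothetical file name)
; Assumed build: nasm -f elf msglen.asm && ld -m elf_i386 -o msglen msglen.o

section .data
    msg     db  "Hello, World!", 0x0a   ; 13 characters plus the newline byte
    msgLen  equ $ - msg                 ; assemble-time constant, occupies no memory

section .text
    global _start

_start:
    mov     eax, 1          ; system call 1 = exit
    mov     ebx, msgLen     ; use the computed length as the exit status
    int     0x80
```

Running `./msglen; echo $?` should print 14, the 13 characters of the string plus the newline.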

Text Section (The first chunk):

The remainder of the program is technically the text section, so we’ll go at it in chunks.

Similar to ‘.data’, ‘.text’ is where our instructions (and the bulk of the program) go. Declaring ‘_start’ as global makes the label visible to the linker; it plays the role of a main() function in other languages and is the entry point the linker looks for by default.

The program begins execution at the ‘_start’ label by moving the address of ‘msg’ and the value of ‘msgLen’ into the registers ‘ecx’ and ‘edx’ respectively. Similar to how you pass a string into Python’s print function, the string’s address and length must be placed in these registers in order to output the string to the screen.

After moving our data into those registers we call the print subroutine (think of subroutines like functions) and then the exit subroutine. But those haven’t been defined yet! How does our program know what they mean? It wouldn’t if we never defined them, but thankfully we do, lower down. Since the assembler resolves labels no matter where in the file they are defined, you can define subroutines in whatever order makes the most sense (and looks cleanest) to you.

Text Section (The second chunk — print):

So we defined the print label, and now our program knows where to go when we call print. But what’s up with pushing those registers? This step is actually totally unnecessary for our program, but it’s good practice. We push the current values of eax and ebx onto the stack so that when we overwrite them in a second the data isn’t lost. The last thing you want is to lose data in a register because you called a subroutine that modified it without your knowledge.

Similar to how we just had to put the string and length in ecx and edx, eax needs to contain the integer 4 (the number of the write system call) so the kernel knows we want to print, and ebx needs 1 so it knows the output is going to stdout. Simple stuff!

int 0x80 actually isn’t the integer 128. I mean, 0x80 is 128, but int here means interrupt, not integer: the program triggers software interrupt 0x80, which hands control to the Linux kernel. The kernel reads eax to determine what to do. It sees 4 in eax, checks ebx for the output file descriptor (stdout here) and then writes what ecx points to, up to the number of bytes specified by edx. Get it? Good!

Then we pop the values off the stack back into the registers that we modified and return! Notice we don’t return a value? ret is required here to end our subroutine: it pops the return address that call pushed onto the stack and jumps to it, getting us back to where we originally called the print subroutine.

Text Section (The last chunk — exit):

At this point you should see what’s going on, but we’ll run through it. The label exit is defined so that we can refer to this address in our program. Then we move the value 1 into eax and trigger the interrupt, causing the kernel to act based on the value of eax. What happens with system call number 1 on Linux? The program exits with the status code specified in ebx (0 for a successful exit, by convention).
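Putting the three chunks together, the whole program might look like the sketch below. It is reconstructed from the walkthrough rather than copied from the repository, so the exact layout, the file name `hello.asm` and the build commands in the comments are assumptions.

```nasm
; hello.asm: a sketch of the program the walkthrough describes.
; Assumed build: nasm -f elf hello.asm && ld -m elf_i386 -o hello hello.o

section .data
    msg     db  "Hello, World!", 0x0a   ; the bytes to print, ending in a newline
    msgLen  equ $ - msg                 ; length of msg, computed at assembly time

section .text
    global _start

_start:
    mov     ecx, msg        ; address of the string
    mov     edx, msgLen     ; number of bytes to print
    call    print
    call    exit            ; does not return

print:
    push    eax             ; save the registers we are about to overwrite
    push    ebx
    mov     eax, 4          ; system call 4 = write
    mov     ebx, 1          ; file descriptor 1 = stdout
    int     0x80            ; hand control to the kernel
    pop     ebx             ; restore in reverse order of the pushes
    pop     eax
    ret                     ; pop the return address and jump back to the caller

exit:
    mov     eax, 1          ; system call 1 = exit
    mov     ebx, 0          ; exit status 0
    int     0x80
```

Note that print and exit are defined after the point where they are called; the assembler resolves the labels regardless of their position in the file.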


Hello world in Linux x86-64 assembly

A “hello world” program writes to stdout (calling write) then exits (calling exit). The assembly program hello.s below does that on Linux x86-64.

The first important document is the x86-64 ABI specification, maintained by Intel. (Weirdly, the official location for the ABI specification is some random dude’s personal GitHub account. Welcome to the sketchy world of assembly.) The ABI specification describes system calls in the abstract, as it applies to any operating system. Importantly:

The system call number is put in rax.
Arguments are put in the registers rdi, rsi, rdx, r10, r8 and r9, in that order. (This differs from ordinary function calls, which use rcx for the fourth argument; the syscall instruction itself overwrites rcx and r11.)
The system call is made with the syscall instruction.
The return value of the system call is in rax. An error is signalled by returning -errno.

The second document is the Linux 64-bit system call table. This specifies the system call number for each Linux system call. For our example, the write system call is 1 and exit is 60.

Finally, you want the man pages for the system calls, which tell you their signature, e.g. for write:

ssize_t write(int fd, const void *buf, size_t count);

Armed with this, we know to:

put the system call number 1 in rax
put the fd argument in rdi
put the buf argument in rsi
put the count argument in rdx
finally, call syscall
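The steps above can be sketched in NASM syntax as follows. This is not the article's hello.s; the file name `hello64.asm`, the labels and the build command are assumptions. The sketch also checks the result in rax, since a negative value signals an error.

```nasm
; hello64.asm (hypothetical file name)
; Assumed build: nasm -f elf64 hello64.asm && ld -o hello64 hello64.o

section .rodata
    msg     db  "Hello, world!", 0x0a
    len     equ $ - msg

section .text
    global _start

_start:
    mov     eax, 1              ; rax = 1: write (writing eax zero-extends into rax)
    mov     edi, 1              ; rdi = fd 1 (stdout)
    lea     rsi, [rel msg]      ; rsi = buf
    mov     edx, len            ; rdx = count
    syscall                     ; overwrites rcx and r11; result comes back in rax

    xor     edi, edi            ; assume success: exit status 0
    test    rax, rax            ; a negative result is -errno
    jns     .exit
    mov     edi, 1              ; write failed: exit status 1
.exit:
    mov     eax, 60             ; rax = 60: exit
    syscall
```

Run normally, the program should print the message and exit with status 0. Run with stdout closed (for example `./hello64 >&-` in a POSIX shell), write should fail and the exit status should be 1.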


Assembler Linux

Assemblers on Linux

Linux has traditionally used the GNU Assembler (GAS, invoked with the as command), which is part of the GNU Binutils package; GCC invokes it to assemble its output. This assembler is cross-platform, i.e. it can assemble programs written in the assembly languages of various processors. However, GAS uses AT&T syntax rather than Intel syntax, so programmers accustomed to Intel syntax find it somewhat uncomfortable to use.
For example, a program that prints the message "Hello, world!" (referred to as hello from here on) looks like this:

.section .data
msg:
    .ascii "Hello, world!\n"
    len = . - msg          # the symbol len is assigned the length of the string
.section .text
.global _start             # program entry point
_start:
    movl $4, %eax          # system call no. 4: sys_write
    movl $1, %ebx          # stream no. 1: stdout
    movl $msg, %ecx        # pointer to the string to print
    movl $len, %edx        # length of the string
    int $0x80              # call the kernel
    movl $1, %eax          # system call no. 1: sys_exit
    xorl %ebx, %ebx        # exit with status 0
    int $0x80              # call the kernel

As the example shows, the differences from Intel-style assemblers lie in the instruction syntax as well as in the syntax of assembler directives and comments.
Newer versions of GAS can use Intel syntax for instructions, but the syntax of directives and comments remains the traditional one. Intel syntax is enabled with the .intel_syntax directive and its noprefix parameter. The program above then changes as follows:

.intel_syntax noprefix
.section .data
msg:
    .ascii "Hello, world!\n"
    len = . - msg                  # the symbol len is assigned the length of the string
.section .text
.global _start                     # program entry point
_start:
    mov eax, 4                     # system call no. 4: sys_write
    mov ebx, 1                     # stream no. 1: stdout
    mov ecx, OFFSET FLAT:msg       # pointer to the string to print
                                   # OFFSET FLAT means: use the address
                                   # that msg will have at load time
    mov edx, OFFSET len            # length of the string (OFFSET makes it an immediate, not a memory load)
    int 0x80                       # call the kernel
    mov eax, 1                     # system call no. 1: sys_exit
    xor ebx, ebx                   # exit with status 0
    int 0x80                       # call the kernel

Another widely used assembler for Linux is the Netwide Assembler (NASM, invoked with the nasm command). NASM uses Intel syntax. In addition, the syntax of NASM's assembler directives partly coincides with that of MASM. The same program written for NASM looks like this:

section .data
    msg db "Hello, world!", 0Ah    ; NASM does not expand \n inside double quotes, so the newline is given as a byte
    len equ $-msg                  ; the symbol len is assigned the length of the string
section .text
    global _start                  ; program entry point
_start:
    mov eax, 4                     ; system call no. 4: sys_write
    mov ebx, 1                     ; stream no. 1: stdout
    mov ecx, msg                   ; pointer to the string to print
    mov edx, len                   ; length of the string
    int 80h                        ; call the kernel
    mov eax, 1                     ; system call no. 1: sys_exit
    xor ebx, ebx                   ; exit with status 0
    int 80h                        ; call the kernel

Besides the assemblers listed above, FASM and YASM can also be used on Linux. Both support Intel syntax, but FASM has its own directive syntax, while YASM is syntactically fully analogous to NASM and differs from it only in its license type. All examples from here on use NASM syntax. Those who want to use GAS may find an article comparing the two assemblers useful. Moreover, with the .intel_syntax noprefix directive in GAS, the differences between them are less significant. Programs written for NASM generally assemble with YASM without problems.

Program structure

Programs on Linux consist of sections, each with its own purpose [6]. The .text section contains the program code. The .data and .bss sections contain data: the former holds initialized data and the latter uninitialized data. The .data section is always written into the executable file, whereas .bss takes up no space in the executable and is created only when the process is loaded into memory. A section begins with the directive SECTION section_name. The SEGMENT directive can be used in place of SECTION. There is no directive to mark the end of a section: a section ends automatically when a new section is declared or at the end of the program. The order of sections in a program does not matter. The program must declare a label named _start, which is the program's entry point. In addition, the entry-point label must be declared as a global identifier with the directive GLOBAL _start. Since the name of the entry point is predefined, there is no need for an END directive marking the end of the program; NASM does not support such a directive.
When building multi-module programs, every label (identifier of a variable or function) that is meant to be used from other modules must be declared global with the GLOBAL directive. Conversely, every identifier implemented in another module and declared global there must be declared external with the EXTERN directive. The function sum, which adds two numbers and was discussed in the previous lab, looks like this in NASM:

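SECTION .text
global sum
sum:
    push ebp            ; save the caller's frame pointer
    mov ebp, esp        ; set up our own stack frame
    mov eax, [ebp+8]    ; load the first argument
    add eax, [ebp+12]   ; add the second argument; the result is returned in eax
    pop ebp             ; restore the caller's frame pointer
    ret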
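To illustrate the EXTERN side of a multi-module program, here is a minimal sketch of a second module that calls sum. It assumes the listing above is saved as `sum.asm`; the file name `main.asm`, the argument values and the build commands in the comments are hypothetical.

```nasm
; main.asm (hypothetical file name): calls sum from another module.
; Assumed build:
;   nasm -f elf sum.asm
;   nasm -f elf main.asm
;   ld -m elf_i386 -o main main.o sum.o

SECTION .text
GLOBAL _start
EXTERN sum                  ; implemented in sum.asm, where it is declared global

_start:
    push    dword 3         ; second argument
    push    dword 2         ; first argument (arguments are pushed right to left)
    call    sum             ; the result comes back in eax
    add     esp, 4*2        ; the caller removes the arguments from the stack

    mov     ebx, eax        ; use the sum as the exit status
    mov     eax, 1          ; system call no. 1: sys_exit
    int     80h
```

Running `./main; echo $?` should print 5. If the GLOBAL declaration in sum.asm or the EXTERN declaration here is left out, the build fails because the symbol sum cannot be resolved.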

Using library functions

Assembly programs can use functions from the C library. To use a function, it must first be declared with the EXTERN directive. For example, to use the printf function, the following directive must be given first:
EXTERN printf
The hello program can be modified so that it prints its output with the C library's printf function instead of a Linux API function. The code of this program, which we will call hello-c, looks like this:

SECTION .data
    msg db "Hello, world!", 0
    fmt db "%s", 0Ah, 0     ; the format string must be NUL-terminated too
SECTION .text
    GLOBAL _start           ; program entry point
    EXTERN printf           ; external function from the C library
_start:
    push msg                ; second parameter: pointer to the string
    push fmt                ; first parameter: pointer to the format
    call printf             ; call the function
    add esp, 4*2            ; remove the parameters from the stack
    mov eax, 1              ; system call no. 1: sys_exit
    xor ebx, ebx            ; exit with status 0
    int 80h                 ; call the kernel

Assembling programs that use library functions is no different from assembling programs that use only API functions. The differences appear only at the linking stage. The specifics of linking are discussed later.

Differences between NASM and MASM

Building and running

nasm -f elf hello.asm
ld -m elf_i386 -o hello hello.o
./hello
