How do computer programs work? A super deep dive explainer by the amazing Lexi Mattick
(RAW shitty notes, you have been warned!)
[[Hardware]] [[Software]] [[Technology]] [[John von Neumann]] [[MoC]]
where I left off (2023-08-27): https://cpu.land/the-translator-in-your-computer
[[libraries and tools]] [[link]] this would make a great conference talk https://cpu.land/editions/one-pager
question: what were the alternatives to the [[von Neumann]] architecture, and which ones map best to parallel processing? see [[Unix Parallel]] as a good simple model for making things parallel
great meta thought quote:
(This was one of those psyching-myself-out moments for me — seriously, this is how the program you are using to read this article is running! Your CPU is fetching your browser’s instructions from RAM in sequence and directly executing them, and they’re rendering this article.)
question: can an OS be done in hardware?
Learning: XNU (see Wikipedia) is the macOS system kernel ![[Pasted image 20230826192100.png]]
Kernel privilege rings are denoted by the 2 least significant bits of the `cs` register! 11 = user mode (ring 3), 00 = kernel mode (ring 0), 01 & 10 = rings 1 & 2, meant for drivers, rarely used
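A minimal sketch of the "2 least significant bits" idea, assuming the value being decoded is an x86-64 `cs` segment selector. The selector values `0x33` and `0x10` are illustrative numbers picked for the example, not read from a real CPU.

```python
RING_NAMES = {
    0: "kernel mode (ring 0)",
    1: "ring 1 (rarely used)",
    2: "ring 2 (rarely used)",
    3: "user mode (ring 3)",
}


def privilege_ring(cs_selector: int) -> int:
    """Return the ring number stored in the two least significant bits."""
    return cs_selector & 0b11


# Illustrative selector values only (assumption, not taken from hardware).
for selector in (0x33, 0x10):
    ring = privilege_ring(selector)
    print(f"{selector:#06x} -> {RING_NAMES[ring]}")
```

Masking with `0b11` throws away everything except the two bits the note is about, so any selector ending in `11` reports ring 3 and any ending in `00` reports ring 0.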
idea: why can't the OS be compiled together with the program it runs, eliminating the extra indirections?
IVT (interrupt vector table, see Wikipedia): the table in memory that maps interrupt numbers to handler addresses; the software interrupt used for syscalls is looked up there
list of Linux system calls on Michael Kerrisk’s online manpage directory [[Linux Syscall List]]
CISC: complex instruction set computer (see Wikipedia)
question: are the IVT's address locations user-defined, or a hardware feature?
question: how much does the OS indirection actually slow down code execution?
libc: the Unix C library that wraps syscalls
question: are hardware interrupts executed during or between CPU cycles?
question: can you make a keyboard input characters so fast that it blocks the CPU? (this only applies to PS/2-style keyboards)
preemptive multitasking is the tech term for what lets single-core CPUs do multitasking
- fixed timeslice round-robin scheduling: timeslices are in the 10 ms range and are often called “quantums”
- minimum granularity = a lower bound on the timeslice duration
- jiffy: time unit; Linux = 1000 Hz
- Linux's scheduler target latency is 6 ms, and a minimum granularity
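A rough sketch of the scheduling notes above: a timeslice derived from a target latency with a lower bound, then a fixed-timeslice round-robin loop. The 6 ms target latency comes from the notes; the minimum granularity of 1.0 ms and the task workloads are made-up values for illustration, and the function names are hypothetical. This is a toy simulation, not how the Linux scheduler is implemented.

```python
from collections import deque


def timeslice_ms(n_tasks, target_latency_ms, min_granularity_ms):
    """Split the target latency evenly, but never go below the lower bound."""
    return max(target_latency_ms / n_tasks, min_granularity_ms)


def round_robin(work_ms, quantum_ms):
    """Simulate fixed-timeslice round-robin; return the order tasks ran in.

    work_ms maps a task name to the CPU time (ms) it still needs.
    """
    queue = deque(work_ms.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        order.append(name)
        remaining -= quantum_ms  # the timer interrupt preempts the task here
        if remaining > 0:
            queue.append((name, remaining))  # not finished: back of the line
    return order


# min_granularity_ms=1.0 is an assumed value, not one from the notes.
quantum = timeslice_ms(n_tasks=3, target_latency_ms=6.0, min_granularity_ms=1.0)
print(quantum)
print(round_robin({"a": 5, "b": 2, "c": 3}, quantum))
```

With many tasks the even split would shrink toward zero, which is why the lower bound exists: `timeslice_ms(100, 6.0, 1.0)` stays at the assumed 1.0 ms floor.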
question: what is the scheduler target latency for Windows? does the Linux kernel change that enables more RTOS-like behavior allow a different scheduler style or a smaller timeslice?
Since 2007, Linux has used a scheduler called the Completely Fair Scheduler (CFS); Linux 6.6 replaced it with EEVDF
research topic: how does the linux scheduler work?
question: are there any advantages to using cooperative multitasking in today's age?
idea: what about a programming language that compiles like Rust and automagically™️ does the 'explicit yield' back to the OS. use case: Arduinos and other IoT devices
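To make the 'explicit yield' idea concrete, here is a small sketch of cooperative multitasking using Python generators. Each task hands control back voluntarily with `yield`; nothing ever preempts it. The `task` and `run_cooperatively` names are hypothetical helpers for this example only.

```python
from collections import deque


def task(name, steps, log):
    """A task that does `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # explicit yield: hand control back to the scheduler


def run_cooperatively(tasks):
    """Run tasks in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
            ready.append(current)  # it yielded, so it gets another turn later
        except StopIteration:
            pass                   # the task finished; drop it


log = []
run_cooperatively([task("a", 2, log), task("b", 3, log)])
print(log)
```

The weakness is visible in the code: a task that never reaches `yield` would starve everything else, which is the problem preemptive multitasking solves.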
How to run a program
observation: omg, a .exe is just a big list of opcodes and data!
question: what programming language was the NT kernel written in? I assume C/C++ was only for Unix systems... hmm, what was the Windows 3.1 kernel written in?
[[Linux Source Code Network Diagram]]
Cool Note:
You know that convention where a program’s first argument is the name of the program? That’s _purely a convention_, and isn’t actually set by the `execve` syscall itself! The first argument will be whatever is passed to `execve` as the first item in the `argv` argument, even if it has nothing to do with the program name.
examples of this are Golan's GO=134.2 golan.exe blah blah |this bit|
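A quick way to see that `argv[0]` is only a convention: start `/bin/sh` but hand it a different first argument. This sketch assumes a POSIX system with a shell at `/bin/sh`, and relies on the shell reporting its own `argv[0]` as `$0` when `-c` is given no extra arguments. The name `banana` is arbitrary and `run_with_fake_argv0` is a hypothetical helper.

```python
import subprocess


def run_with_fake_argv0(fake_name: str) -> str:
    """Run /bin/sh with argv[0] set to fake_name; return what it sees as $0."""
    result = subprocess.run(
        [fake_name, "-c", 'echo "$0"'],  # argv[0] is whatever we say it is
        executable="/bin/sh",            # the program that is really executed
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(run_with_fake_argv0("banana"))
```

`subprocess.run` keeps the argument list and the `executable` path separate, which mirrors how `execve` takes the filename and `argv` as independent parameters.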
[[Cyber Security]] vulnerability with the order of a program's bash call '<program name> <arg>'
I was curious why the arity is hardcoded in the macro name; I googled around and learned that this was a workaround to fix some security vulnerability.
Thought: contribute to the Linux kernel open source!
Question: how do you call <> with non-standard args
do_execveat_common(fd, filename, argv, envp, flags);
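I think this is where the Linux kernel deals with path variables and the scoping of where to look for your program to run (correction: `PATH` lookup happens in userspace, in the shell or libc's `execvp`; the kernel only resolves the path it is handed)
idea: what would it look like to have just the Linux kernel and your program and nothing else?
note: Linux reads the first 256 bytes of the file into a buffer and uses them to work out the file format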
#define BINPRM_BUF_SIZE 256
idea: because BINPRM_BUF_SIZE is in user space, can you change it?
idea: can you write a program that fits entirely within the 256 bytes, and what weird stuff can you do?
How Linux handles different file types
each format handler's load_binary() runs and looks for [[magic numbers]]; if a handler does not recognize the file type it returns an error code (`ENOEXEC`) and the kernel tries the next handler, otherwise it returns success or a real error
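A toy model of the handler loop described above, assuming each handler only inspects the first `BINPRM_BUF_SIZE` bytes. The handler functions and their string return values are hypothetical stand-ins for the kernel's `load_binary()` callbacks; only the magic numbers (`\x7fELF` and `#!`) are real.

```python
import errno

BINPRM_BUF_SIZE = 256  # same value as the kernel constant quoted above


def load_script(buf):
    """Stand-in handler: recognizes files that start with a shebang."""
    return "script" if buf.startswith(b"#!") else -errno.ENOEXEC


def load_elf(buf):
    """Stand-in handler: recognizes the ELF magic number."""
    return "elf" if buf.startswith(b"\x7fELF") else -errno.ENOEXEC


HANDLERS = [load_script, load_elf]


def detect_format(file_bytes):
    """Try each handler in turn; the first one that recognizes the file wins."""
    buf = file_bytes[:BINPRM_BUF_SIZE]
    for load_binary in HANDLERS:
        result = load_binary(buf)
        if result != -errno.ENOEXEC:
            return result
    return -errno.ENOEXEC  # no handler understood this file


print(detect_format(b"\x7fELF\x02\x01\x01"))
print(detect_format(b"#!/bin/bash\necho hi\n"))
print(detect_format(b"just some text"))
```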
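when running a program, the Linux kernel checks whether the first two characters are `#!`, as in
#!/bin/bash
if so, it is handled inside the exec SYSCALL by the script handler (`binfmt_script`), which re-runs the exec with the interpreter named on that line!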
question:
- can this feature be abused to write a cool program that is completely written within the shebang?
- and are there any security holes there?
- if a shebang line is more than 256 characters long, everything past 256 characters will be completely lost.
- hmm, what happens to the ignored bytes?
- [[Max]] should read this article
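A simplified sketch of shebang parsing with the 256-byte cutoff from the list above. It follows the notes' description (anything past the buffer is simply lost); real kernels differ by version in how they treat an over-long line, so treat this as an illustration only. `parse_shebang` is a hypothetical helper, and everything after the interpreter path is kept as one single argument.

```python
BINPRM_BUF_SIZE = 256  # buffer size from the notes


def parse_shebang(file_bytes):
    """Return (interpreter, optional_argument), or None if there is no shebang."""
    buf = file_bytes[:BINPRM_BUF_SIZE]  # bytes past the buffer are never seen
    if not buf.startswith(b"#!"):
        return None
    line = buf[2:].split(b"\n", 1)[0].strip()
    parts = line.split(None, 1)  # interpreter, then the rest as ONE argument
    if not parts:
        return None
    interpreter = parts[0].decode()
    argument = parts[1].decode() if len(parts) > 1 else None
    return interpreter, argument


print(parse_shebang(b"#!/bin/bash\necho hi\n"))
print(parse_shebang(b"#!/usr/bin/env python3 -u\n"))

# An over-long line: only the part inside the 256-byte buffer survives.
interp, arg = parse_shebang(b"#!/bin/sh " + b"x" * 300 + b"\n")
print(interp, len(arg))
```

The ignored bytes are not lost from the file itself: the interpreter still reads the whole script later, it just never receives the truncated part as an argument.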
person to ask about syscalls [[Oleg Nesterov]]
for a shebang script, syscall execve( ,,,, argv) removes the first argument and passes whatever comes next unmodified!
// Arguments: filename, argv, envp
execve("./script", [ "A", "B", "C" ], []);
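A sketch of how the argument list might be rebuilt for a script, using the `execve("./script", ["A", "B", "C"], [])` call above. It assumes the script starts with `#!/bin/bash` and no optional argument; `script_argv` is a hypothetical helper, not a kernel function.

```python
def script_argv(interpreter, optional_arg, filename, argv):
    """Build the argv the interpreter receives for a shebang script."""
    new_argv = [interpreter]
    if optional_arg is not None:
        new_argv.append(optional_arg)
    new_argv.append(filename)
    new_argv.extend(argv[1:])  # drop the original argv[0], keep the rest as-is
    return new_argv


print(script_argv("/bin/bash", None, "./script", ["A", "B", "C"]))
```

The original first argument `"A"` disappears, while `"B"` and `"C"` arrive at the interpreter untouched.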
idea: create my own file extension, 'handler', with custom [[magic numbers]] and byte encoding
note: check out and read the POSIX standard document (see POSIX on Wikipedia), to see what it actually says is POSIX [[POSIX]]
How the bash shell knows how to execute a program without a magic number or shebang line
$ chmod +x ./file
$ ./file
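- call the exec syscall
- if it fails because the file format is not recognized, execute the file as a shell script
- the shell is invoked with the command name as its first operand, with any remaining arguments passed to the new shell
- if the file is not a text file, the shell may skip that step and return an exit status of 126
executable formats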
- Linux
  - Executable and Linkable Format
  - [[ELF]]
    - idea: stuff weird stuff into the ELF header
      - USE the PT_NOTE field because it can contain anything
    - weird behavior
      - If the memory region length is longer than the length in the file, the extra memory will be filled with zeroes.
      - called BSS segments.
    - Question: how big is a typical [[ELF]] file header in bytes?
- Mac
  - Mach-O
- Windows .exe
  - Portable Executable format
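On the ELF header size question in the list above: for 64-bit ELF the file header is 64 bytes (52 bytes for 32-bit). The sketch below lays out the 64-bit header with `struct` and parses a hand-built example. It assumes a little-endian 64-bit file, and the field values in the fake header (entry point `0x401000` and so on) are invented for the demo.

```python
import struct

# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data):
    """Parse a little-endian ELF64 file header into a small dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    fields = ELF64_HEADER.unpack_from(data)
    return {
        "type": fields[1],
        "machine": fields[2],
        "entry": fields[4],
        "phoff": fields[5],
        "shoff": fields[6],
        "phnum": fields[10],
        "shnum": fields[12],
    }


# Hand-built header with made-up values: 64-bit, little-endian, one program header.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
fake = ELF64_HEADER.pack(ident, 2, 62, 1, 0x401000, 64, 0, 0, 64, 56, 1, 64, 0, 0)

print(ELF64_HEADER.size)
print(parse_elf64_header(fake))
```

The fake header sets `shoff` and `shnum` to zero, meaning it declares no section header table at all, only program headers.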
`wc -l binfmt_* | sort -nr | sed 1d`
- `wc -l binfmt_*` counts the number of lines in all files starting with `binfmt_` and prints the line counts
- `sort -nr` sorts the numeric outputs in reverse numerical order (highest to lowest)
- `sed 1d` removes the first line of output from the sorted results
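The same three-stage pipeline, sketched in Python over in-memory file contents so each stage is visible. The file names match the `binfmt_*` pattern but the contents and line counts are made up; `line_counts_descending` is a hypothetical helper. The dropped first row is the `total` line that `wc` prints when given several files, which sorts to the top.

```python
def line_counts_descending(files):
    """Mimic `wc -l FILES | sort -nr | sed 1d` for a dict of name -> text."""
    # wc -l: count newline characters per file, then add a "total" row
    counts = [(text.count("\n"), name) for name, text in files.items()]
    counts.append((sum(n for n, _ in counts), "total"))
    # sort -nr: numeric sort, highest first
    counts.sort(key=lambda row: row[0], reverse=True)
    # sed 1d: delete the first line (the total)
    return counts[1:]


# Made-up contents purely for illustration.
example = {
    "binfmt_elf.c": "line\n" * 5,
    "binfmt_script.c": "line\n" * 2,
    "binfmt_misc.c": "line\n" * 3,
}
print(line_counts_descending(example))
```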
Question: what's the minimal feature set to let 95% of Linux programs run? syscalls, bash features, and strip everything else out. Is that something one person (me) could build?
Question: is the [[Section Header Table]] within [[ELF]] optional?
idea: use gradient descent to generate random text strings that actually run; permute the initial string to generate something that works.
Observation: executable files are just a protocol for where data is located
idea: make a small machine learning model, included with the operating system, that becomes the hub for deterministic code generation and acts like a dynamic linker [[Machine Learning Zip Compression]]
Static Linking vs Dynamic Linking
Question: are these blocks of code .c files, other ELF binaries, or are they whole programs like 'ls -hal'?
Linux uses .so (Shared Object), Mac uses .dylib, Windows uses .dll
how does the ELF interpreter (`PT_INTERP`) work?
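A small taste of shared objects from Python: looking up a function in an already-loaded shared library at run time and calling it. This assumes Linux or macOS, where `ctypes.CDLL(None)` exposes the symbols already linked into the running process, libc included. It shows run-time symbol lookup, not the `PT_INTERP` loading sequence itself.

```python
import ctypes

# None means "the current process": on Linux/macOS this includes libc,
# which was dynamically linked in when the interpreter started (assumption).
libc = ctypes.CDLL(None)

# Describe the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"hello"))
```

The `strlen` code lives in the shared object, not in the Python script; the lookup by name happens when the attribute is first accessed.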
memory management unit (MMU): memory is fake! the OS builds a page table and the MMU uses it to translate addresses. interesting: the typical page size is 4 KiB. [[Hack]] x86-64 also allows operating systems to enable larger 2 MiB or 1 GiB pages
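A toy version of the page table lookup, assuming 4 KiB pages. A flat Python dict stands in for the real multi-level page table, and the page and frame numbers are invented for the example; `split_address` and `translate` are hypothetical helpers.

```python
PAGE_SIZE = 4096   # 4 KiB
OFFSET_BITS = 12   # 2**12 == 4096


def split_address(vaddr):
    """Split a virtual address into (page number, offset within the page)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


def translate(vaddr, page_table):
    """Map a virtual address to a physical one via the page table."""
    page, offset = split_address(vaddr)
    frame = page_table[page]  # a KeyError here plays the role of a page fault
    return frame * PAGE_SIZE + offset


# Made-up mapping: virtual page 0x12345 lives in physical frame 0x42.
page_table = {0x12345: 0x42}
print(hex(translate(0x12345678, page_table)))
```

Only the page number is translated; the low 12 bits pass through unchanged, which is why larger pages mean fewer table entries for the same amount of memory.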
Author
by Oran Collins
github.com/wisehackermonkey