6b13f685e
김민수
BSP 최초 추가
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
|
AMCC suggested to set the PMU bit to 0 for best performace on the
PPC440 DDR controller. The 440er common DDR setup files (sdram.c &
spd_sdram.c) are changed accordingly. So all 440er boards using
these setup routines will automatically receive this performance
increase.
Please see below some benchmarks done by AMCC to demonstrate this
performance changes:
SDRAM0_CFG0[PMU] = 1 (U-boot default for Bamboo, Yosemite and Yellowstone)
Stream benchmark results
This system uses 8 bytes per DOUBLE PRECISION word.
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 112345 microseconds.
(= 112345 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
WARNING
For best results, please be sure you know the precision of your system
timer.
Function Rate (MB/s) RMS time Min time Max time
Copy: 256.7683 0.1248 0.1246 0.1250
Scale: 246.0157 0.1302 0.1301 0.1302
Add: 255.0316 0.1883 0.1882 0.1885
Triad: 253.1245 0.1897 0.1896 0.1899
TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp ->
localhost
ttcp-t: 16777216 bytes in 0.28 real seconds = 454.29 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.14, calls/sec = 7268.57
ttcp-t: 0.0user 0.1sys 0:00real 60% 0i+0d 0maxrss 0+2pf 3+1506csw
SDRAM0_CFG0[PMU] = 0 (Suggested modification)
Setting PMU = 0 provides a noticeable performance improvement *2% to
5% improvement in memory performance.
*Improves the Mbit/sec for TTCP benchmark by almost 76%.
Stream benchmark results
This system uses 8 bytes per DOUBLE PRECISION word.
Array size = 2000000, Offset = 0
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 120066 microseconds.
(= 120066 clock ticks)
Increase the size of the arrays if this shows that you are not getting
at least 20 clock ticks per test.
WARNING
For best results, please be sure you know the precision of your system
timer.
Function Rate (MB/s) RMS time Min time Max time
Copy: 262.5167 0.1221 0.1219 0.1223
Scale: 258.4856 0.1238 0.1238 0.1240
Add: 262.5404 0.1829 0.1828 0.1831
Triad: 266.8594 0.1800 0.1799 0.1802
TTCP Benchmark Results
ttcp-t: socket
ttcp-t: connect
ttcp-t: buflen=8192, nbuf=2048, align=16384/0, port=5000 tcp ->
localhost
ttcp-t: 16777216 bytes in 0.16 real seconds = 804.06 Mbit/sec +++
ttcp-t: 2048 I/O calls, msec/call = 0.08, calls/sec = 12864.89
ttcp-t: 0.0user 0.0sys 0:00real 46% 0i+0d 0maxrss 0+2pf 120+1csw
2006-07-28, Stefan Roese <sr@denx.de>
|