About the Authors |
|
xxiii | |
Acknowledgments |
|
xxv | |
Chapter 1 Introduction to Performance Management |
|
1 | (12) |
|
1.1 Application Developer's Perspective |
|
|
1 | (2) |
|
1.1.1 Defining the Application |
|
|
2 | (1) |
|
1.1.2 Determining Application Specifications |
|
|
2 | (1) |
|
1.1.3 Designing Application Components |
|
|
2 | (1) |
|
1.1.4 Developing the Application Codes |
|
|
2 | (1) |
|
1.1.5 Testing, Tuning, and Debugging |
|
|
2 | (1) |
|
1.1.6 Deploying the System and/or Application |
|
|
3 | (1) |
|
1.1.7 System and Application Maintenance |
|
|
3 | (1) |
|
1.2 System Administrator's Perspective |
|
|
3 | (1) |
|
1.2.1 Making the System Available to the Users |
|
|
3 | (1) |
|
1.2.2 Monitoring the Usage of the System |
|
|
3 | (1) |
|
1.2.3 Maintaining a Certain Level of System Performance |
|
|
3 | (1) |
|
1.2.4 Planning for Future Processing Needs |
|
|
4 | (1) |
|
1.3 Total System Resource Perspective |
|
|
4 | (1) |
|
1.4 Rules of Performance Tuning |
|
|
5 | (8) |
Chapter 2 Performance Management Tasks |
|
13 | (6) |
|
|
14 | (1) |
|
|
15 | (1) |
|
2.3 Performance Characterization |
|
|
15 | (1) |
|
2.4 Performance Forecasting or Prediction |
|
|
15 | (1) |
|
2.5 Application Optimization |
|
|
16 | (1) |
|
|
16 | (1) |
|
2.7 Performance Problem Resolution |
|
|
17 | (2) |
Chapter 3 A Performance Management Methodology |
|
19 | (20) |
|
|
19 | (7) |
|
3.1.1 System Configuration |
|
|
20 | (1) |
|
|
21 | (1) |
|
3.1.3 Performance Expectations |
|
|
21 | (4) |
|
|
25 | (1) |
|
|
25 | (1) |
|
3.1.6 Duration of the Problem |
|
|
25 | (1) |
|
3.1.7 Understanding the Options |
|
|
26 | (1) |
|
3.1.8 Understanding the Politics |
|
|
26 | (1) |
|
|
26 | (5) |
|
|
27 | (1) |
|
3.2.2 Purpose of the Measurement: Baseline versus Crisis |
|
|
27 | (1) |
|
3.2.3 Duration and Interval of the Measurement |
|
|
28 | (1) |
|
3.2.4 Particular Metric Needs |
|
|
29 | (1) |
|
3.2.5 Metric Documentation |
|
|
29 | (1) |
|
|
29 | (1) |
|
3.2.7 Saturation of Certain System Resources |
|
|
30 | (1) |
|
3.2.8 Relationship Between the Metric and the Application |
|
|
30 | (1) |
|
3.2.9 Qualitative versus Quantitative Measurements |
|
|
30 | (1) |
|
|
30 | (1) |
|
3.3 Interpretation and Analysis |
|
|
31 | (2) |
|
3.3.1 Multi-User versus Workstation |
|
|
31 | (1) |
|
3.3.2 Interactive versus Batch, Compute-Intensive versus I/O-Intensive |
|
|
32 | (1) |
|
3.3.3 Application Architecture |
|
|
32 | (1) |
|
3.3.4 Performance of the CPU |
|
|
32 | (1) |
|
|
32 | (1) |
|
|
33 | (1) |
|
3.4 Identifying Bottlenecks |
|
|
33 | (1) |
|
3.4.1 Resource Saturation |
|
|
33 | (1) |
|
3.4.2 Growing Resource Queue |
|
|
33 | (1) |
|
3.4.3 Resource Starvation |
|
|
34 | (1) |
|
3.4.4 Unsatisfactory Response Time |
|
|
34 | (1) |
|
|
34 | (1) |
|
|
34 | (5) |
|
3.5.1 Determine Whether There Is a Particular Bottleneck |
|
|
35 | (1) |
|
3.5.2 Devise a Simple Test for Measuring Results |
|
|
35 | (1) |
|
3.5.3 Do Not Tune Randomly |
|
|
35 | (1) |
|
3.5.4 Use Heuristics and Logic |
|
|
35 | (1) |
|
3.5.5 Look for Simple Causes |
|
|
35 | (1) |
|
3.5.6 Develop an Action Plan |
|
|
36 | (1) |
|
3.5.7 Change Only One Thing at a Time! |
|
|
36 | (1) |
|
|
36 | (1) |
|
3.5.9 Understand the Limits of the Application's Architecture |
|
|
36 | (1) |
|
|
36 | (3) |
Chapter 4 Kernel Instrumentation and Performance Metrics |
|
39 | (12) |
|
4.1 Approaches to Measuring |
|
|
39 | (7) |
|
4.1.1 External Hardware Monitors |
|
|
40 | (1) |
|
4.1.2 Internal Hardware Monitors |
|
|
40 | (1) |
|
4.1.3 Internal Software Monitors |
|
|
41 | (5) |
|
4.2 Kernel Instrumentation and Measurement Interface |
|
|
46 | (2) |
|
4.2.1 Acceptable Overhead |
|
|
47 | (1) |
|
4.3 Performance Metrics Categories |
|
|
48 | (1) |
|
4.3.1 Categories of Metrics by Scope |
|
|
48 | (1) |
|
4.3.2 Categories of Metrics by Type of Data |
|
|
49 | (1) |
|
|
49 | (2) |
Chapter 5 Survey of Unix Performance Tools |
|
51 | (56) |
|
5.1 Multipurpose Diagnostic Tools |
|
|
54 | (23) |
|
|
54 | (5) |
|
|
59 | (7) |
|
5.1.3 GlancePlus Motif Adviser |
|
|
66 | (1) |
|
5.1.4 Advantages and Disadvantages of GlancePlus Motif |
|
|
67 | (1) |
|
5.1.5 sar (System Activity Reporter) |
|
|
68 | (6) |
|
|
74 | (1) |
|
|
75 | (1) |
|
|
76 | (1) |
|
|
77 | (4) |
|
|
77 | (1) |
|
|
78 | (1) |
|
|
79 | (1) |
|
|
79 | (2) |
|
5.3 Disk Diagnostic Tools |
|
|
81 | (4) |
|
|
81 | (2) |
|
|
83 | (1) |
|
|
84 | (1) |
|
5.4 Memory Diagnostic Tools |
|
|
85 | (6) |
|
|
85 | (1) |
|
|
86 | (1) |
|
|
87 | (2) |
|
|
89 | (1) |
|
|
90 | (1) |
|
5.5 Performance Characterization and Prediction Tools |
|
|
91 | (6) |
|
|
91 | (3) |
|
|
94 | (2) |
|
|
96 | (1) |
|
|
97 | (1) |
|
|
97 | (1) |
|
|
97 | (1) |
|
5.6 Process Accounting Tools |
|
|
97 | (1) |
|
|
98 | (1) |
|
|
98 | (1) |
|
5.7 Application Optimization Tools |
|
|
98 | (1) |
|
|
98 | (1) |
|
|
98 | (1) |
|
5.7.3 dpat (Distributed Performance Analysis Tool) |
|
|
98 | (1) |
|
5.7.4 hpc (Histogram Program Counter) |
|
|
99 | (1) |
|
|
99 | (1) |
|
|
99 | (1) |
|
|
99 | (1) |
|
5.7.8 TTV (Thread Trace Visualizer) |
|
|
99 | (1) |
|
5.8 Network Diagnostic Tools |
|
|
99 | (2) |
|
|
99 | (2) |
|
5.8.2 nettune (HP-UX 10.x) and ndd (HP-UX 11.0 and Later) |
|
|
101 | (1) |
|
|
101 | (1) |
|
5.9 Resource Management Tools |
|
|
101 | (6) |
|
5.9.1 Process Resource Manager (PRM) |
|
|
103 | (1) |
|
5.9.2 Workload Manager (WLM) |
|
|
104 | (2) |
|
5.9.3 Capacity Planning Tools |
|
|
106 | (1) |
Chapter 6 Hardware Performance Issues |
|
107 | (62) |
|
|
107 | (9) |
|
6.1.1 Central Processing Unit |
|
|
108 | (1) |
|
|
108 | (1) |
|
|
108 | (1) |
|
|
108 | (1) |
|
6.1.5 Translation Lookaside Buffers |
|
|
109 | (1) |
|
|
110 | (5) |
|
6.1.7 Input/Output Devices (I/O Devices) |
|
|
115 | (1) |
|
|
115 | (1) |
|
|
115 | (1) |
|
6.2 Processor Characteristics |
|
|
116 | (18) |
|
|
116 | (1) |
|
6.2.2 Out-of-Order versus In-Order Execution |
|
|
117 | (1) |
|
6.2.3 Scalar and Superscalar Processors |
|
|
118 | (2) |
|
|
120 | (2) |
|
6.2.5 Problems with Pipelined Architectures |
|
|
122 | (2) |
|
|
124 | (4) |
|
|
128 | (1) |
|
6.2.8 Summary of processors |
|
|
128 | (6) |
|
|
134 | (2) |
|
6.3.1 Master/Slave Implementation |
|
|
135 | (1) |
|
6.3.2 Asymmetric Implementation |
|
|
135 | (1) |
|
6.3.3 Symmetric Multi-Processing Implementation |
|
|
135 | (1) |
|
6.3.4 Massively Parallel Processing Implementation |
|
|
135 | (1) |
|
|
136 | (1) |
|
6.4 Cache Memory Performance Issues |
|
|
136 | (6) |
|
|
136 | (2) |
|
6.4.2 Visible Cache Problems |
|
|
138 | (3) |
|
|
141 | (1) |
|
6.5 Main Memory Performance Issues |
|
|
142 | (8) |
|
|
142 | (1) |
|
|
143 | (4) |
|
|
147 | (1) |
|
6.5.4 Visible TLB Performance Problems |
|
|
148 | (2) |
|
6.6 I/O Performance Issues |
|
|
150 | (2) |
|
6.6.1 I/O Backplane Types |
|
|
150 | (1) |
|
|
151 | (1) |
|
6.6.3 I/O Problem Solving |
|
|
152 | (1) |
|
|
152 | (12) |
|
|
152 | (1) |
|
|
152 | (1) |
|
|
152 | (1) |
|
|
153 | (1) |
|
|
153 | (1) |
|
|
153 | (3) |
|
|
156 | (1) |
|
6.7.8 D-Series and R-Series |
|
|
156 | (1) |
|
|
157 | (3) |
|
|
160 | (1) |
|
|
160 | (2) |
|
|
162 | (1) |
|
6.7.13 The rp7405 and rp7410 Servers |
|
|
163 | (1) |
|
|
164 | (5) |
|
|
165 | (1) |
|
|
165 | (1) |
|
6.8.3 Integrity Superdome |
|
|
166 | (1) |
|
|
167 | (1) |
|
|
167 | (2) |
Chapter 7 CPU Bottlenecks |
|
169 | (34) |
|
7.1 Processes and Threads |
|
|
170 | (5) |
|
|
173 | (1) |
|
|
173 | (2) |
|
7.1.3 Conclusions and recommendations |
|
|
175 | (1) |
|
|
175 | (11) |
|
7.2.1 Time-share Scheduling |
|
|
175 | (3) |
|
7.2.2 Real-Time Scheduling |
|
|
178 | (1) |
|
7.2.3 Process Resource Manager (PRM) and CPU Resource Allocation |
|
|
179 | (3) |
|
7.2.4 Workload Manager (WLM) |
|
|
182 | (1) |
|
|
182 | (1) |
|
7.2.6 SMP Scheduling Issues |
|
|
183 | (3) |
|
|
186 | (7) |
|
7.3.1 Hardware Partitioning |
|
|
186 | (1) |
|
7.3.2 Virtual Partitions (vPars) |
|
|
187 | (1) |
|
|
188 | (1) |
|
|
189 | (4) |
|
7.4 Traps and Protection Violations |
|
|
193 | (1) |
|
|
194 | (1) |
|
7.5.1 Global CPU Saturation Metrics |
|
|
194 | (1) |
|
7.5.2 Global CPU Queue Metrics: Run Queue vs. Load Average |
|
|
195 | (1) |
|
7.5.3 Other Global CPU Metrics |
|
|
195 | (1) |
|
7.5.4 Per-Process CPU Metrics |
|
|
195 | (1) |
|
7.6 Typical Metric Values |
|
|
195 | (2) |
|
7.7 Symptoms of a CPU Bottleneck |
|
|
197 | (2) |
|
|
197 | (1) |
|
7.7.2 Expensive CPU Utilization |
|
|
197 | (1) |
|
7.7.3 Expensive System Calls |
|
|
198 | (1) |
|
7.8 CPU Use and Performance Tools |
|
|
199 | (1) |
|
7.9 Tuning CPU Bottlenecks |
|
|
200 | (3) |
|
|
200 | (1) |
|
|
201 | (1) |
|
7.9.3 Application Optimization |
|
|
201 | (1) |
|
7.9.4 Tuning CPU-Related Parameters |
|
|
202 | (1) |
Chapter 8 Memory Bottlenecks |
|
203 | (50) |
|
8.1 Virtual Address Space |
|
|
204 | (14) |
|
8.1.1 PA-RISC Virtual Address Space Layout |
|
|
204 | (4) |
|
8.1.2 IPF Virtual Address Space Layout |
|
|
208 | (3) |
|
8.1.3 Modifying the Default Virtual Address Space Layout |
|
|
211 | (3) |
|
8.1.4 Adaptive Address Space |
|
|
214 | (2) |
|
8.1.5 Shared Memory Windows |
|
|
216 | (1) |
|
|
217 | (1) |
|
|
218 | (9) |
|
8.2.1 Variable Page Sizes |
|
|
218 | (4) |
|
|
222 | (5) |
|
|
227 | (5) |
|
|
227 | (1) |
|
|
228 | (1) |
|
|
228 | (1) |
|
|
229 | (1) |
|
8.3.5 Memory Allocation Management with PRM |
|
|
229 | (2) |
|
|
231 | (1) |
|
|
231 | (1) |
|
|
231 | (1) |
|
|
232 | (3) |
|
8.4.1 Dynamic Buffer Cache |
|
|
232 | (1) |
|
8.4.2 Memory-Mapped Files |
|
|
233 | (1) |
|
8.4.3 Memory-Mapped Semaphores |
|
|
234 | (1) |
|
8.5 Process and Thread Execution |
|
|
235 | (1) |
|
|
236 | (1) |
|
|
236 | (1) |
|
|
236 | (2) |
|
|
237 | (1) |
|
8.6.2 Environment Options |
|
|
238 | (1) |
|
8.6.3 General Malloc Issues |
|
|
238 | (1) |
|
8.7 System V Shared Memory |
|
|
238 | (2) |
|
8.7.1 Allocating Shared Memory |
|
|
238 | (1) |
|
8.7.2 Attaching to a Shared Memory Segment |
|
|
239 | (1) |
|
8.7.3 Modifying Shared Memory Attributes |
|
|
239 | (1) |
|
8.7.4 Kernel Tuneable Parameters for Shared Memory |
|
|
240 | (1) |
|
|
240 | (1) |
|
8.9 Memory Management Policies |
|
|
240 | (9) |
|
8.9.1 Regions and Pregions |
|
|
241 | (1) |
|
8.9.2 Thresholds and Policies |
|
|
242 | (1) |
|
8.9.3 Values for Memory Management Parameters |
|
|
242 | (1) |
|
8.10 Sizing Memory and the Swap Area |
|
|
242 | (2) |
|
8.10.1 Sizing the Swap Area |
|
|
243 | (1) |
|
|
243 | (1) |
|
|
244 | (2) |
|
8.11.1 Global Memory Saturation Metrics |
|
|
244 | (1) |
|
8.11.2 Global Memory Queue Metrics |
|
|
244 | (1) |
|
8.11.3 Other Global Memory Metrics |
|
|
244 | (1) |
|
8.11.4 Per-Process Memory Metrics |
|
|
245 | (1) |
|
8.11.5 Typical Metric Values |
|
|
245 | (1) |
|
8.12 Types of Memory Management Bottlenecks |
|
|
246 | (1) |
|
8.13 Expensive System Calls |
|
|
246 | (1) |
|
8.14 Tuning Memory Bottlenecks |
|
|
247 | (2) |
|
8.14.1 Hardware Solutions |
|
|
247 | (1) |
|
8.14.2 Software Solutions |
|
|
247 | (1) |
|
8.14.3 Application Optimization |
|
|
248 | (1) |
|
8.15 Memory-Related Tuneable Parameters |
|
|
249 | (4) |
|
8.15.1 Logical View of Physical Memory Utilization |
|
|
251 | (2) |
Chapter 9 Disk Bottlenecks |
|
253 | (64) |
|
9.1 Disk Hardware Descriptions |
|
|
254 | (11) |
|
9.1.1 Interface Cards (Host Bus Adapters or HBAs) |
|
|
254 | (2) |
|
|
256 | (2) |
|
|
258 | (6) |
|
9.1.4 HP StorageWorks Disk Array XP-series |
|
|
264 | (1) |
|
|
264 | (1) |
|
9.2 Review of Disk I/O Concepts |
|
|
265 | (7) |
|
9.2.1 Disk Access Methods |
|
|
265 | (1) |
|
9.2.2 Unbuffered (Raw) I/O |
|
|
266 | (4) |
|
|
270 | (1) |
|
9.2.4 Direct Files and Memory-Mapped Files |
|
|
270 | (2) |
|
|
272 | (1) |
|
9.3 Logical Volume Manager Concepts |
|
|
272 | (7) |
|
|
273 | (1) |
|
|
274 | (2) |
|
9.3.3 Bad Block Relocation |
|
|
276 | (1) |
|
9.3.4 Read-Only Volume Groups |
|
|
277 | (1) |
|
9.3.5 Note on Mixing RAID Parity Hardware Striping and LVM Striping |
|
|
277 | (1) |
|
|
278 | (1) |
|
|
279 | (1) |
|
9.4.1 Performance Features |
|
|
279 | (1) |
|
9.4.2 Performance Problems |
|
|
279 | (1) |
|
9.4.3 Performance Measurement |
|
|
280 | (1) |
|
9.5 Shared-Bus Access to Disks |
|
|
280 | (3) |
|
9.5.1 Single-Initiator Configurations |
|
|
280 | (1) |
|
9.5.2 Multi-Initiator Configurations |
|
|
280 | (1) |
|
9.5.3 Performance Load on Multi-Initiator Buses |
|
|
280 | (2) |
|
9.5.4 Disk Drive Write Caches |
|
|
282 | (1) |
|
9.6 File Systems and the Kernel |
|
|
283 | (4) |
|
9.6.1 I/O-Related Kernel Tables |
|
|
283 | (1) |
|
|
284 | (1) |
|
|
285 | (1) |
|
|
286 | (1) |
|
9.6.5 Dynamic Buffer Cache |
|
|
286 | (1) |
|
9.6.6 Maxusers, Maxfiles and Other Kernel Parameters |
|
|
287 | (1) |
|
|
287 | (15) |
|
|
287 | (5) |
|
|
292 | (7) |
|
|
299 | (1) |
|
|
300 | (1) |
|
|
301 | (1) |
|
9.7.6 Comparison of HFS and JFS |
|
|
302 | (1) |
|
|
302 | (4) |
|
9.8.1 Global Disk Saturation Metrics |
|
|
302 | (1) |
|
9.8.2 Global Disk Queue Metrics |
|
|
303 | (1) |
|
9.8.3 Other Global Disk Metrics |
|
|
303 | (2) |
|
9.8.4 Per-Process Disk Metrics |
|
|
305 | (1) |
|
9.8.5 Typical Metric Values |
|
|
305 | (1) |
|
9.9 Types of Disk Bottlenecks |
|
|
306 | (1) |
|
9.10 Expensive System Calls |
|
|
306 | (1) |
|
9.11 Tuning Disk Bottlenecks |
|
|
306 | (2) |
|
9.11.1 Hardware Solutions |
|
|
306 | (1) |
|
9.11.2 Configuration Solutions |
|
|
306 | (1) |
|
9.11.3 File System Solutions |
|
|
307 | (1) |
|
9.11.4 Application Solutions |
|
|
307 | (1) |
|
|
308 | (2) |
|
|
308 | (1) |
|
9.12.2 Data Placement Recommendations |
|
|
309 | (1) |
|
9.12.3 Review and Tune Database Parameters |
|
|
309 | (1) |
|
9.13 Disk-Related Tuneable Parameters |
|
|
310 | (7) |
|
|
310 | (1) |
|
9.13.2 dbc_max_pct and dbc_min_pct |
|
|
311 | (1) |
|
|
311 | (1) |
|
|
311 | (1) |
|
|
311 | (1) |
|
|
311 | (1) |
|
|
312 | (1) |
|
|
312 | (1) |
|
9.13.9 maxfiles_lim, and no_lvm_disks |
|
|
312 | (1) |
|
|
312 | (1) |
|
9.13.11 HFS-Specific Tuneable Parameters |
|
|
313 | (1) |
|
9.13.12 JFS-Specific Tuneable Parameters |
|
|
314 | (3) |
Chapter 10 Network Bottlenecks |
|
317 | (40) |
|
10.1 Networking Hardware Descriptions |
|
|
317 | (5) |
|
10.1.1 Networking Infrastructure Devices |
|
|
318 | (1) |
|
|
319 | (1) |
|
|
319 | (1) |
|
|
320 | (1) |
|
|
321 | (1) |
|
10.2 Review of Networking Concepts |
|
|
322 | (7) |
|
10.2.1 Interface Driver and DLPI |
|
|
322 | (1) |
|
|
323 | (5) |
|
|
328 | (1) |
|
|
328 | (1) |
|
|
329 | (1) |
|
10.2.6 Auto Port Aggregation |
|
|
329 | (1) |
|
|
329 | (1) |
|
|
329 | (1) |
|
|
329 | (4) |
|
|
329 | (1) |
|
|
330 | (1) |
|
|
330 | (1) |
|
|
330 | (1) |
|
|
330 | (2) |
|
|
332 | (1) |
|
10.4 Networked File System |
|
|
333 | (6) |
|
10.4.1 General NFS issues |
|
|
333 | (1) |
|
10.4.2 Client-Side Issues |
|
|
334 | (1) |
|
10.4.3 NFS Server-Side Issues |
|
|
334 | (1) |
|
10.4.4 UDP Versus TCP Issues for NFS |
|
|
335 | (1) |
|
|
336 | (1) |
|
10.4.6 Measuring NFS Performance |
|
|
337 | (1) |
|
10.4.7 Local File System Usage With NFS |
|
|
337 | (1) |
|
10.4.8 Other NFS-Related Processes |
|
|
337 | (2) |
|
|
339 | (2) |
|
10.5.1 Heartbeat Networks in a Cluster |
|
|
339 | (1) |
|
10.5.2 Client Networks in a Cluster |
|
|
340 | (1) |
|
10.5.3 SGeRAC Lock Networks |
|
|
340 | (1) |
|
|
341 | (6) |
|
10.6.1 Global Network Saturation Metrics |
|
|
341 | (1) |
|
10.6.2 Other Global Network Metrics |
|
|
341 | (1) |
|
10.6.3 Network Adapter Metrics |
|
|
342 | (1) |
|
10.6.4 Per-Process Network Metrics |
|
|
342 | (1) |
|
|
343 | (4) |
|
10.7 Types of Network Bottlenecks |
|
|
347 | (1) |
|
10.8 Expensive System Calls |
|
|
347 | (1) |
|
10.9 Tuning Network Bottlenecks |
|
|
348 | (9) |
|
10.9.1 Hardware Solutions |
|
|
348 | (1) |
|
10.9.2 Configuration Solutions |
|
|
348 | (1) |
|
10.9.3 NFS File System Solutions |
|
|
348 | (1) |
|
10.9.4 Application Solutions |
|
|
348 | (1) |
|
10.10 Network-Related Tuneable Parameters |
|
|
349 | (1) |
|
10.10.1 Kernel Tuneable Parameters for Networking |
|
|
349 | (1) |
|
|
350 | (1) |
|
|
351 | (1) |
|
10.11 Web Server Tuning Issues |
|
|
352 | (1) |
|
10.11.1 Hardware Configuration |
|
|
352 | (1) |
|
|
352 | (1) |
|
|
353 | (1) |
|
10.11.4 File System Tuning |
|
|
353 | (1) |
|
10.11.5 Networking Card Tuning |
|
|
354 | (1) |
|
10.11.6 Zeus Web Server Parameters |
|
|
354 | (1) |
|
10.12 Database Server Tuning Issues |
|
|
355 | (2) |
Chapter 11 Compiler Performance Tuning |
|
357 | (50) |
|
11.1 Compilers and Optimization |
|
|
358 | (2) |
|
11.1.1 Advantages and Disadvantages of Optimization |
|
|
359 | (1) |
|
|
360 | (7) |
|
11.2.1 Level 0 Optimization |
|
|
360 | (1) |
|
11.2.2 Level 1 Optimization (+O1) |
|
|
361 | (1) |
|
11.2.3 Level 2 Optimization (-O or +O2) |
|
|
361 | (4) |
|
11.2.4 Level 3 Optimization (+O3) |
|
|
365 | (1) |
|
11.2.5 Level 4 Optimization (+O4) |
|
|
366 | (1) |
|
11.3 Compiling for a Target Runtime Environment |
|
|
367 | (2) |
|
|
367 | (1) |
|
|
368 | (1) |
|
|
368 | (1) |
|
|
369 | (1) |
|
11.4 Finer Control Over Optimization |
|
|
369 | (20) |
|
11.4.1 Storage Optimizations |
|
|
370 | (1) |
|
11.4.2 Standards-Related Options |
|
|
371 | (2) |
|
|
373 | (3) |
|
11.4.4 Memory Latency Optimizations |
|
|
376 | (1) |
|
11.4.5 Symbol Binding Options |
|
|
377 | (4) |
|
11.4.6 Overhead Reduction Optimizations |
|
|
381 | (2) |
|
11.4.7 Threaded Application Optimizations |
|
|
383 | (2) |
|
11.4.8 Floating Point Options |
|
|
385 | (2) |
|
11.4.9 Trade-offs Between Memory Expansion and CPU Utilization |
|
|
387 | (1) |
|
11.4.10 Bundled Convenience Options |
|
|
388 | (1) |
|
11.4.11 Informative Options |
|
|
388 | (1) |
|
11.5 Linker Optimizations |
|
|
389 | (1) |
|
11.5.1 Options When Linking Shared Libraries |
|
|
389 | (1) |
|
11.5.2 Options for Setting the Maximum Virtual Page Size |
|
|
389 | (1) |
|
11.6 Profile-based Optimization |
|
|
390 | (7) |
|
|
392 | (1) |
|
11.6.2 Changing the Default flow.data File |
|
|
393 | (1) |
|
11.6.3 PA-RISC and Itanium Differences When Using PBO |
|
|
393 | (1) |
|
11.6.4 Using PBO With Shared Libraries |
|
|
394 | (1) |
|
11.6.5 Speeding Up PA-RISC PBO Compilations |
|
|
394 | (1) |
|
11.6.6 Execution Frequency Compiler Extensions |
|
|
395 | (2) |
|
11.7 Specific Options for Fortran and COBOL |
|
|
397 | (1) |
|
11.7.1 Fortran Optimizing Preprocessor |
|
|
397 | (1) |
|
11.7.2 Special COBOL Considerations |
|
|
397 | (1) |
|
11.8 Why Does Optimization "Break" Applications? |
|
|
397 | (2) |
|
11.8.1 C and ANSI C Assumptions |
|
|
398 | (1) |
|
11.8.2 Fortran Assumptions |
|
|
399 | (1) |
|
11.9 Debugging Optimization Problems |
|
|
399 | (1) |
|
11.10 Porting Applications |
|
|
400 | (1) |
|
11.10.1 Global and Static Variable Optimization |
|
|
402 | (1) |
|
|
402 | (1) |
|
11.11 Code to Demonstrate Optimization Effects |
|
|
403 | (4) |
Chapter 12 Java Run-time Performance Tuning |
|
407 | (10) |
|
|
407 | (2) |
|
|
409 | (2) |
|
12.2.1 Garbage Collection Issues |
|
|
409 | (1) |
|
12.2.2 Synchronized Keyword |
|
|
410 | (1) |
|
|
411 | (1) |
|
|
411 | (1) |
|
12.4 HP-UX Tuneable Parameters and Java |
|
|
412 | (1) |
|
12.5 Using Large Java Heap Sizes |
|
|
412 | (2) |
|
|
414 | (3) |
Chapter 13 Designing Applications for Performance |
|
417 | (18) |
|
13.1 Tips for Application Design |
|
|
418 | (1) |
|
13.2 Shared Versus Archive Libraries |
|
|
419 | (1) |
|
|
419 | (1) |
|
|
420 | (1) |
|
13.4 Choosing an Inter-Process Communication (IPC) Mechanism |
|
|
421 | (7) |
|
|
421 | (1) |
|
|
422 | (1) |
|
13.4.3 System V Semaphores |
|
|
422 | (1) |
|
13.4.4 Memory-Mapped Semaphores |
|
|
423 | (1) |
|
|
423 | (4) |
|
|
427 | (1) |
|
13.5 Trade-offs with Network System Calls |
|
|
428 | (3) |
|
13.5.1 Select and Poll System Calls |
|
|
428 | (1) |
|
|
428 | (2) |
|
13.5.3 Configuring Berkeley Sockets for Performance |
|
|
430 | (1) |
|
13.5.4 Sendfile System Call |
|
|
431 | (1) |
|
13.6 Instrumenting an Application for Performance Monitoring |
|
|
431 | (4) |
|
13.6.1 Metrics Available by Using the ARM API |
|
|
431 | (1) |
|
13.6.2 Run-time libraries |
|
|
432 | (1) |
|
13.6.3 Calls Available With the ARM API |
|
|
432 | (1) |
|
13.6.4 Using the ARM API to Measure Application Performance |
|
|
433 | (2) |
Chapter 14 Application Profiling |
|
435 | (34) |
|
|
436 | (18) |
|
14.1.1 Analyzing a Program With Caliper |
|
|
436 | (6) |
|
14.1.2 Using Caliper to Track Down the Performance Issues |
|
|
442 | (5) |
|
14.1.3 Attaching to a Running Process |
|
|
447 | (1) |
|
14.1.4 Profiling Multiple Processes |
|
|
447 | (2) |
|
14.1.5 Making Your Own Configuration Files |
|
|
449 | (1) |
|
14.1.6 Instrumenting Sections of an Application With Caliper |
|
|
449 | (3) |
|
14.1.7 Detailed Information on Caliper Configuration Files |
|
|
452 | (1) |
|
|
452 | (1) |
|
14.1.9 Caliper Output Options |
|
|
453 | (1) |
|
14.1.10 Advantages and Disadvantages of Caliper |
|
|
453 | (1) |
|
|
454 | (1) |
|
14.2.1 Steps in Using CXperf |
|
|
|
|
459 | (1) |
|
14.2.3 Advantages and Disadvantages of CXperf |
|
|
459 | (1) |
|
|
460 | (1) |
|
14.3.1 Step 1: Special Compilation and Linking |
|
|
460 | (1) |
|
14.3.2 Step 2: Program Execution |
|
|
462 | (1) |
|
14.3.3 Step 3: Profile Generation |
|
|
463 | (1) |
|
|
464 | (1) |
|
|
465 | (4) |
Appendix A Performance Tools Alphabetical Reference |
|
469 | (34) |
Appendix B HP-UX Version Naming Reference |
|
503 | (2) |
Appendix C Dynamically Tuneable Parameters |
|
505 | (6) |
Index |
|
511 | |