AMD Presentations: January 19, 10:00 – 12:00 – The Future of GPUs; January 20, 10:00 – 12:00 – Accelerated Computing. Roberto Brandão, AMD Latin America

AMD Future of GPUs - Campus Party

Page 1: AMD Future of GPUs - Campus Party

AMD Presentations

January 19, 10:00 – 12:00 – The Future of GPUs

January 20, 10:00 – 12:00 – Accelerated Computing

Roberto Brandão

AMD Latin America

Page 2: AMD Future of GPUs - Campus Party

The Future of GPUs

Roberto Brandão

AMD Latin America

Page 3: AMD Future of GPUs - Campus Party

Today’s GPUs focused on

GAMING

ENTERTAINMENT

PRODUCTIVITY

Page 5: AMD Future of GPUs - Campus Party

DirectX® 11 Tessellation

Images courtesy of Unigine Corp.

No Tessellation (DirectX® 10) | Tessellation (DirectX® 11)

Page 6: AMD Future of GPUs - Campus Party

DirectX® 11 Multi-Threading

Application, DirectX runtime, and DirectX driver can each run in separate threads

Tasks like loading a texture or compiling a shader can execute in parallel with main rendering thread

DirectX® 10 DirectX® 11
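
The idea of loading resources in parallel with the main rendering thread can be sketched in a few lines. This is only a conceptual illustration in Python, not the DirectX 11 API: `load_texture`, `render_frames`, and the texture name are hypothetical, and the "load" is a stand-in for slow I/O or shader compilation.

```python
import queue
import threading


def load_texture(name, ready):
    # Stand-in for slow work such as reading a texture or compiling a shader.
    texture = {"name": name, "pixels": [0] * 16}
    ready.put(texture)


def render_frames(frame_count, ready):
    loaded = []
    frames = []
    for frame in range(frame_count):
        # Pick up any resources that finished loading, without blocking the frame.
        while True:
            try:
                loaded.append(ready.get_nowait()["name"])
            except queue.Empty:
                break
        frames.append((frame, tuple(loaded)))
    return frames


ready = queue.Queue()
worker = threading.Thread(target=load_texture, args=("rock_texture", ready))
worker.start()                    # loading proceeds in the background
frames = render_frames(3, ready)  # the render loop never waits for the loader
worker.join()
```

The render loop only polls the queue, so a slow load delays when the texture becomes available but never stalls a frame.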

Page 9: AMD Future of GPUs - Campus Party

Order Independent Transparency (OIT)

Efficient rendering of many overlapping transparent objects

Smoke, fire, hair, foliage, fences, water, glass

Rendering transparent objects correctly requires sorting

Blending is an order dependent operation

DirectCompute 11 simplifies OIT by sorting transparent pixels in one shader pass

Uses atomic operations and append buffers
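
Why blending needs sorting can be shown with a small CPU-side sketch. It is not the DirectCompute implementation (which, per the slide, relies on atomic operations and append buffers on the GPU); it only assumes each pixel has collected a list of transparent fragments, sorts them by depth, and blends them back to front. All function names and the fragment layout are hypothetical.

```python
def blend_over(dst, src_color, src_alpha):
    # Standard "over" blend of one RGB fragment onto the current pixel color.
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_color, dst))


def resolve_pixel(fragments, background):
    # fragments: list of (depth, rgb, alpha); a larger depth is farther away.
    color = background
    for _, rgb, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = blend_over(color, rgb, alpha)
    return color


def blend_in_submission_order(fragments, background):
    # What plain alpha blending does: no sorting, so draw order matters.
    color = background
    for _, rgb, alpha in fragments:
        color = blend_over(color, rgb, alpha)
    return color


background = (0.0, 0.0, 0.0)
red_near = (1.0, (1.0, 0.0, 0.0), 0.5)
blue_far = (2.0, (0.0, 0.0, 1.0), 0.5)

sorted_a = resolve_pixel([red_near, blue_far], background)
sorted_b = resolve_pixel([blue_far, red_near], background)
unsorted = blend_in_submission_order([red_near, blue_far], background)
```

With sorting, both submission orders give the same color; without it, drawing the near fragment first lets the far one bleed over it.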

Page 10: AMD Future of GPUs - Campus Party

DirectX® 11 OIT in Action

Order-Independent Transparency

Simple Alpha Blending

Skeleton exposed

Arm bleeds through body

Page 11: AMD Future of GPUs - Campus Party

Render Post-Processing

Apply filter kernel to every pixel in rendered image

Depth of field, motion blur, tone mapping, edge detection, smoothing, sharpening

Requires data from neighbouring pixels

Example: constant time filter spreading

Accurately simulates certain lens effects such as depth of field

Novel processing technique developed at AMD in conjunction with UC Berkeley

DirectCompute greatly simplifies implementation while increasing performance and visual fidelity

– Alpha buffer tricks no longer needed: fewer artifacts

– Shared memory optimizations: better performance
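
Applying a filter kernel to every pixel, using data from neighbouring pixels, can be sketched as follows. This is a plain gather-style 3x3 box blur on the CPU, chosen only because it is the simplest instance of the idea; it is not the constant time filter spreading technique named on the slide, and `apply_kernel` is a hypothetical helper.

```python
def apply_kernel(image, kernel):
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    sy = min(max(y + ky - radius, 0), height - 1)
                    sx = min(max(x + kx - radius, 0), width - 1)
                    total += weight * image[sy][sx]
            out[y][x] = total
    return out


box_blur = [[1.0 / 9.0] * 3 for _ in range(3)]
image = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
blurred = apply_kernel(image, box_blur)  # the bright centre pixel is spread evenly
```

Every output pixel reads nine input pixels, which is why neighbouring threads sharing those reads through on-chip memory can pay off on a GPU.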

Page 12: AMD Future of GPUs - Campus Party

DirectX® 11 Depth of Field in Action

Filter Spreading

Legacy Method

Noticeable halos

Hard silhouette

Page 13: AMD Future of GPUs - Campus Party

Shadow Rendering

HDAO (High Definition Ambient Occlusion)

Detects “valleys” in scene geometry and darkens them according to depth

Contact hardened shadows

Sharpens shadow edges where they contact the casting object and makes edges increasingly blurry as they get farther away
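
One common way to obtain the contact hardening behaviour is the similar-triangles penumbra estimate used by percentage-closer soft shadows. The slides do not say which formula the pictured implementation uses, so the sketch below is an assumption for illustration only; `penumbra_width` and its parameters are hypothetical names.

```python
def penumbra_width(receiver_depth, blocker_depth, light_size):
    # Depths are distances from the light; the blocker must sit between
    # the light and the receiver.
    if blocker_depth <= 0 or receiver_depth < blocker_depth:
        raise ValueError("blocker must lie between the light and the receiver")
    return (receiver_depth - blocker_depth) * light_size / blocker_depth


at_contact = penumbra_width(10.0, 10.0, 2.0)    # shadow touches its caster: sharp edge
farther_away = penumbra_width(20.0, 10.0, 2.0)  # receiver far behind caster: wide blur
```

The estimated width is then used as the blur radius when filtering the shadow edge, so the edge is crisp at the contact point and softens with distance.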

Page 14: AMD Future of GPUs - Campus Party

DirectX® 11 Shadows in Action

DirectX 10.1 Shadows

DirectX 11 Contact Hardened Shadows

Images from S.T.A.L.K.E.R.: Call of Pripyat (GSC Game World)

Page 15: AMD Future of GPUs - Campus Party

Lighting Post Effects

Realistic Nighttime lighting

HDR bloom

Lens flare

Atmospheric scattering

Light trails

3D color grading

Motion blur

Page 16: AMD Future of GPUs - Campus Party

Anti-Aliasing

Smooths jagged edges around objects

More obvious in moving images and at lower resolutions

Takes multiple samples of image

More samples = higher quality, but also much more work

Radeon products support 2x, 4x, and 6x sample modes

Comparison images: No AA | 2x AA | 4x AA | 6x AA
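
The trade-off between sample count and quality can be illustrated by estimating how much of one pixel a polygon covers. The sketch assumes a regular grid of sample points and a made-up diagonal edge; real hardware uses tuned sample patterns, so treat `grid_samples` and `inside_polygon` as hypothetical.

```python
def grid_samples(n):
    # n x n regular grid of sample points inside a unit-sized pixel.
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]


def inside_polygon(x, y):
    # Hypothetical polygon covering the corner of the pixel below a diagonal edge.
    return x + y < 0.8


def coverage(samples):
    hits = sum(1 for x, y in samples if inside_polygon(x, y))
    return hits / len(samples)


exact_area = 0.8 * 0.8 / 2.0            # 0.32 of the pixel is really covered
one_sample = coverage(grid_samples(1))   # a single centre sample misses the polygon
four_samples = coverage(grid_samples(2))
sixteen_samples = coverage(grid_samples(4))
```

More samples bring the estimate closer to the true coverage, at the cost of one polygon test per sample.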

Page 17: AMD Future of GPUs - Campus Party

EQAA Modes

Legend: Color Sample Location, Coverage Sample Location, Pixel Boundary

No AA

2x MSAA

2x EQAA (4 coverage samples)

4x MSAA

4x EQAA (8 coverage samples)

8x MSAA

8x EQAA (16 coverage samples)

Page 18: AMD Future of GPUs - Campus Party

Tessellating the Right Way

Can add significant detail to a scene while effectively compressing geometry

But excessive use of tessellation can be inefficient for today’s GPUs

– Poor utilization of rasterizers

– Overshading

– Too many polygon edges for MSAA

Brute force approach is wasteful

[Chart: overshade per pixel (scale 1 to 8) for 25, 15, 5, and 1 pixel triangles]

16 pixel triangle: 100% rasterizer utilization

1 pixel triangle: 6.25% rasterizer utilization
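
The two utilization figures are consistent with a rasterizer that can emit up to 16 pixels of one triangle per clock, since 1/16 = 6.25%. That width is an assumption inferred from the slide's two data points, not a stated hardware specification; the sketch below just makes the arithmetic explicit.

```python
RASTER_PIXELS_PER_CLOCK = 16  # assumption inferred from the slide's two data points


def rasterizer_utilization(triangle_pixels):
    # Fraction of the rasterizer's per-clock pixel slots a single triangle fills.
    used = min(triangle_pixels, RASTER_PIXELS_PER_CLOCK)
    return used / RASTER_PIXELS_PER_CLOCK


for pixels in (16, 5, 1):
    print(f"{pixels:2d} pixel triangle: {rasterizer_utilization(pixels):.2%}")
```

Under this assumption, tessellating down to one-pixel triangles leaves 15 of every 16 pixel slots idle.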

Page 19: AMD Future of GPUs - Campus Party

Morphological Anti-Aliasing

Post-process filtering technique accelerated with DirectCompute

Delivers full-scene anti-aliasing

Not limited to polygon edges, alpha-tested surfaces, etc.

Faster than super-sampling

Performance similar to edge-detect CFAA, but applies to all edges

Compatible with any DirectX® 9/10/11 application

Including games with no AA support

Enabled via AMD Catalyst Control Center™

No AA Morphological AA

Images captured from Aliens vs. Predator by Rebellion

Page 20: AMD Future of GPUs - Campus Party

Morphological Anti-Aliasing

Comparison images: No AA | 4x MSAA | MLAA | MSAA + MLAA

Page 21: AMD Future of GPUs - Campus Party

Today’s GPUs focused on

GAMING

ENTERTAINMENT

PRODUCTIVITY

Page 22: AMD Future of GPUs - Campus Party

Power savings improvement

Page 23: AMD Future of GPUs - Campus Party

We are visual beings

Verbal perception: words are processed at only 150 words per minute

Visual perception: pictures and video are processed 400 to 2000 times faster

Consumers are looking for a better visual experience in an environment with variable content

Content formats and sources have more diversity than ever

New applications will demand computing power that is impossible on today’s hardware

Page 24: AMD Future of GPUs - Campus Party

Enhanced Multimedia Capabilities

Windows® Aero Mode: Playback of HD videos in high quality with Windows® Aero mode enabled

Enhanced UVD 2: Hardware-accelerated decode of dual 1080p HD video streams

Video Gamma: Independent from Windows® desktop for a superior user experience

Brighter Whites: “Blue Stretch” processing increases the blue value of white colors for brighter videos

Dynamic Video Range: Control of levels of black and white during playback

Power Management: Enables new customers for all levels of graphics

Page 25: AMD Future of GPUs - Campus Party

Superior HDMI Audio and Video Features

Enhanced Home Theatre Audio Experience

HDMI 1.3a Dolby TrueHD & DTS-HD Master Audio: Full support for premium Blu-ray audio formats

Dolby TrueHD, DTS-HD Master Audio, AC-3 and DTS

High quality surround sound: Up to 8 channels of 192kHz / 24-bit audio

Advanced Display Quality

HDMI 1.3a Deep Color & x.v.Color: Over 1 billion colors output through HDMI

12-bpc output, 10-bpc (4:4:4) meaningfully derived

Wide range of colors: Full support for wide-gamut x.v.Color video signals

Page 26: AMD Future of GPUs - Campus Party

Improvements already reached consumers

Adobe Flash plugin used by Youtube.com

Better image quality and video smoothness

Lower processor usage

[Chart: processor utilization (0% to 80%), with ATI Stream]

Page 27: AMD Future of GPUs - Campus Party

Convert your DVD videos into near HD quality with DVD Upscaling

Designed to help dramatically improve the quality of your movies

Take Your DVDs to Near HD Quality

Page 28: AMD Future of GPUs - Campus Party

Better video quality from a DVD (DVD Upscaling)

Better definition and sharpness of video streams based on MPEG-2 (DVD) for high definition displays

Comparison images: DVD | Upscaled DVD

Page 29: AMD Future of GPUs - Campus Party

Dramatically Improve Online Video Quality

Watch online videos with smooth playback and sharper, vibrant image quality

Make online video come to life!

Page 30: AMD Future of GPUs - Campus Party

Today’s GPUs focused on

GAMING

ENTERTAINMENT

PRODUCTIVITY

Page 31: AMD Future of GPUs - Campus Party

Introducing Next-Gen Desktop Configurations for …

DCC

CAD

Image courtesy of StudioGPU

Image courtesy Todd Daniele

Driver version 8.66 (ATI Catalyst™ 9.10) or above is required to support ATI Eyefinity technology. To enable a third display requires one panel with a DisplayPort connector.

Page 32: AMD Future of GPUs - Campus Party

Introducing Next-Gen Desktop Configurations for …

Oil & Gas

Medical

Image courtesy Barco Medical Systems

Driver version 8.66 (ATI Catalyst™ 9.10) or above is required to support ATI Eyefinity technology. To enable a third display requires one panel with a DisplayPort connector.

Page 33: AMD Future of GPUs - Campus Party

Maximum Flexibility in Display Configuration*

6x1 Portrait Display Group

3x1 Landscape Display Group

3x1 Landscape Display Group Plus 3 Extended

3x1 Display Group Plus 1 Extended

1x3 Portrait Display Group

Screen images courtesy Todd Daniele and University of Hertfordshire

Page 34: AMD Future of GPUs - Campus Party

Single GPU 4K Output for CAD and DCC*

Image courtesy University of Hertfordshire, D.Atkins

Image courtesy Todd Daniele

* Planned features, specifications, and/or capabilities of top SKU of upcoming ATI FirePro™ professional graphics cards. Subject to change without notice.

Page 35: AMD Future of GPUs - Campus Party

Distinctive Features - High Quality Rendering

Full 30-bit display pipeline produces more than one billion colors and enables you to see more of your data*

Up to 1600 Stream Processors enable you to push visual effects farther than ever before

Images courtesy Barco Medical Systems

Images courtesy Studio GPU

* Requires 30-bit monitor for true 30-bit color display.

Page 36: AMD Future of GPUs - Campus Party

AMD Support for 30-bit Color* in Adobe® Photoshop®

8-bit per color component: 16.7 million colors**

10-bit per color component: Over 1 billion colors**

* Requires 30-bit monitor for true 30-bit color display. ** Simulated images.

Page 37: AMD Future of GPUs - Campus Party

AMD Stream Technology: Using the GPU to Enhance the Notebook PC Experience

Gaming, Entertainment, Productivity

Balanced Platform: Developers leverage AMD GPUs and CPUs for enhanced application performance and user experience

Open Standards: Industry-standard OpenCL™ and DirectCompute 11 enable cross-platform development

Performance and Battery Life: Massively parallel, programmable GPU architecture enables dramatic performance and power efficiency

* ATI Stream technology requires both enabled graphics and an enabled application

Page 38: AMD Future of GPUs - Campus Party

ATI Stream-Enabled Applications & Games

MediaShow 5, MediaShow Espresso, PowerDirector 8, PowerDirector 7

SimHD™ Plug-in for TotalMedia Theatre

Roxio Creator™ 2010, Roxio Creator™ 2010 Pro

Aliens vs. Predator, S.T.A.L.K.E.R.: Call of Pripyat, DiRT 2

Page 39: AMD Future of GPUs - Campus Party

Video Transcoding Sample: No GPU Acceleration

Using four CPU cores

CPU Usage: 100%

GPU Usage: 1%

Time to finish: 1h 52m

Total Energy: 0.23 kWh

Peak power: 145W

Energy Price: $0.1539

Page 40: AMD Future of GPUs - Campus Party

Video Transcoding Sample: ATI GPU Acceleration

Using hundreds of Stream Processors

Values in parentheses are the results without GPU acceleration.

CPU Usage: 45% (100%)

GPU Usage: 35% (1%)

Time to finish: 26m (1h 52m)

Total Energy: 0.11 kWh (0.23 kWh)

Peak power: 198W (145W)

Energy Price: $0.07 ($0.15)
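
The two transcoding runs can be compared with a little arithmetic. The sketch below uses only the time and energy figures quoted on the slides; the dictionary layout and variable names are hypothetical.

```python
def to_minutes(hours, minutes):
    return hours * 60 + minutes


# Figures taken from the two transcoding slides.
cpu_only = {"minutes": to_minutes(1, 52), "energy_kwh": 0.23, "peak_watts": 145}
gpu_accelerated = {"minutes": to_minutes(0, 26), "energy_kwh": 0.11, "peak_watts": 198}

speedup = cpu_only["minutes"] / gpu_accelerated["minutes"]
energy_saving = 1.0 - gpu_accelerated["energy_kwh"] / cpu_only["energy_kwh"]

print(f"Speedup: {speedup:.1f}x")
print(f"Energy saved: {energy_saving:.0%}")
```

The accelerated run draws more power at its peak (198W against 145W) but finishes so much sooner that the total energy used is roughly halved.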

Page 41: AMD Future of GPUs - Campus Party

CONNECTIVITY

Page 42: AMD Future of GPUs - Campus Party

AMD Radeon™ HD 6000 Series Graphics

Get immersed with AMD Eyefinity technology

Get amazing Eye-Definition graphics with DirectX® 11

Get fast applications and incredible video with AMD EyeSpeed technology

2 x miniDP, designed for DisplayPort 1.2

HDMI 1.4a

2 x DVI (DL-DVI + SL-DVI)

Page 43: AMD Future of GPUs - Campus Party

Page 44: AMD Future of GPUs - Campus Party

POWER MANAGEMENT

Page 45: AMD Future of GPUs - Campus Party

Performance per watt

• US datacenters consume more power than five 1000 megawatt nuclear power plants – at a cost of almost $3 billion

• This is 150% more than the consumption in 2001

Page 46: AMD Future of GPUs - Campus Party

Power savings improvement

Page 47: AMD Future of GPUs - Campus Party

AMD PowerTune Technology

Clamps GPU TDP to a pre-determined level

Integrated control processor monitors GPU activity in real time

GPU includes counters across all blocks which are monitored and applied to an algorithm to infer power draw

Dynamically adjusts clock to enforce TDP

Provides direct control over GPU power draw (as opposed to indirect via clock/voltage tweaks)

Algorithmic approach guarantees consistent performance across each product variant

No longer need to constrain default clock speeds to allow for outlier applications

User controllable via AMD OverDrive Utility
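
The control idea described above (infer power from activity counters, then adjust the clock to stay within TDP) can be sketched as follows. This is not AMD's algorithm: the per-block weights, the linear scaling of power with clock, and every number below are assumptions made up for illustration.

```python
# Hypothetical per-block weights: watts drawn at full activity and base clock.
BLOCK_WEIGHTS = {"shader": 150.0, "memory": 50.0}
IDLE_WATTS = 30.0
BASE_CLOCK_MHZ = 800


def estimate_power(activity, clock_mhz):
    # Infer power from activity counters (0.0 to 1.0 per block) instead of
    # measuring it; assumes dynamic power scales linearly with clock.
    dynamic = sum(BLOCK_WEIGHTS[block] * level for block, level in activity.items())
    return IDLE_WATTS + dynamic * clock_mhz / BASE_CLOCK_MHZ


def clamp_clock(activity, tdp_watts, max_mhz=800, min_mhz=500, step_mhz=50):
    # Start at the default clock and step down only while the estimate exceeds TDP.
    clock = max_mhz
    while clock > min_mhz and estimate_power(activity, clock) > tdp_watts:
        clock -= step_mhz
    return clock


game = {"shader": 0.7, "memory": 0.6}     # typical game load (hypothetical)
outlier = {"shader": 1.0, "memory": 1.0}  # stress-test style load (hypothetical)

game_clock = clamp_clock(game, tdp_watts=200.0)
outlier_clock = clamp_clock(outlier, tdp_watts=200.0)
```

In this toy model the game keeps the full default clock while the outlier workload is throttled, which mirrors the point that default clocks no longer have to be set for the worst-case application.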

Page 48: AMD Future of GPUs - Campus Party

PowerTune – Game Power Draw

Games consistently operate at lower power than peak apps

With PowerTune, each product variant is tuned to maximize game performance

Outlier applications are still handled gracefully

Accommodates future application power draw

[Chart: Max Total ASIC Power (W), scale 110 to 200, for Lost Planet DX10, Crysis DX10, Resident Evil 5 DX10, BattleForge DX10.1, Furmark 1.65, 3DMark 03 GT4, Perlin Noise, OCCT SC8100]

Page 49: AMD Future of GPUs - Campus Party

AMD PowerTune Technology

[Chart: GPU core clock (MHz) and FPS over time (seconds) for AMD Radeon™ HD 6950 running 3DMark Vantage: Perlin Noise]

Page 50: AMD Future of GPUs - Campus Party

THE FUTURE OF GPUs

Page 51: AMD Future of GPUs - Campus Party

The future of GPUs

More performance

Better power management

GPU Everywhere

Page 52: AMD Future of GPUs - Campus Party

One Design, Fewer Watts, Massive Capability

Discrete-level DirectX® 11 GPU + Northbridge + Dual-Core CPU = “Zacate” AMD Fusion APU

“Zacate” AMD Fusion APU: 75 sq. mm, 18 watts

Individual chips: 66 sq. mm, 13 watts; 117 sq. mm, 25 watts; 59 sq. mm, 8 watts

Page 53: AMD Future of GPUs - Campus Party

Graphics and Media Processing Efficiency Improvements

Graphics requires memory bandwidth to bring full capabilities to life

2010 IGP-based Platform

[Diagram: CPU chip (CPU cores, UNB, MC) with DDR3 DIMM memory; separate chip with GPU, UVD, and SB functions; PCIe. Link labels: ~7 GB/sec, ~17 GB/sec, ~17 GB/sec]

Bandwidth pinch points and latency hold back the GPU capabilities

2011 APU-based Platform

[Diagram: APU chip (CPU cores, GPU, UVD, UNB / MC) with DDR3 DIMM memory; PCIe. Link labels: ~27 GB/sec, ~27 GB/sec]

3X bandwidth between GPU and memory

Even the same sized GPU is substantially more effective in this configuration

Eliminate latency and power associated with the extra chip crossing

Substantially smaller physical footprint

Page 54: AMD Future of GPUs - Campus Party

“Ontario” & “Zacate” Architecture

APU

> 2 x86 CPU Cores (40nm “Bobcat” core – 1 MB L2, 64-bit FPU)

> C6 and power gating

> Array of SIMD Engines

• DX11 graphics performance

• Industry leading 3D and graphics processing

> 3rd Generation Unified Video Decoder

> H.264, VC1, DivX/Xvid format

> DDR3 800-1066, 2 DIMMs, 64 bit channel

> BGA package

Display and I/O

> Two dedicated digital display interfaces

• Configurable externally as HDMI, DVI, and/or Display Port

• Also supports a single link LVDS for internal panels

> Integrated VGA

> 5x8 PCIe®

> “Hudson” Fusion Controller Hub

Page 55: AMD Future of GPUs - Campus Party

Summary

More realistic graphics

Perfect power management and energy efficiency

Used by all kinds of applications

Everywhere