

# Thermal-Aware System-Level Design of 2D/3D MPSoC Architectures

David Atienza, Embedded Systems Laboratory (ESL), Ecole Polytechnique Fédérale de Lausanne (EPFL)



© ESL/EPFL 2011



# Evolution to Multi-Processor System-on-Chip (MPSoC)

- Roadmap continues:  $90 \rightarrow 65 \rightarrow 45 \rightarrow 32$  nm
- Multi-Processor System-on-Chip (MPSoC) architectures have been a reality for a while...

[Figure: MPSoC block diagram with PEs, SRAM, memory, and I/O]






[Cell Multi-Processor – PS3]





[Figure labels: CMOS, 90nm CMOS]



# MPSoCs are Spreading Fast





# **Design Issues in MPSoCs**

- MPSoCs have very complex architectures
  - Advanced components and CAD tools very expensive
  - Timing-closure issues, reduced system speed
- Aggravated thermal issues: Thermal-Aware MPSoC Layouts
  - Hot-spots, non-uniform thermal gradients





### Advocating Thermal-Aware 2D/3D MPSoC Design

Integration of HW/SW modeling and management





# MPSoC Thermal Modeling Problem

- MPSoC Modeling and Exploration
  - SW simulation: Transactions, cycle-accurate (~100 kHz) [Synopsys Realview, Mentor Primecell, Madsen et al., Angiolini et al.]

At the cycle-accurate level used, these simulators are too slow for thermal analysis of real-life applications!

- Heat Flow Modeling
  - Finite-Element simulation [COMSOL Multiphysics (FEMLAB)]

Too computationally intensive and very complex to tune in MPSoCs with a limited set of sensing components!

  - High-order RC-level heat flow models [Hotspot, Link et al.]

No closed-loop interaction at run-time with inputs from MPSoC components!









Request: a fast (and relatively accurate) thermal model for MPSoCs that enables closed-loop run-time interaction



# **RC-Based Thermal Modeling for MPSoC**

#### Model interface

- Input: power model of tier components, geometrical properties
- Output: temperature of tier components at run-time
- Thermal circuit: 1<sup>st</sup> order RC circuit
  - Heat flow ~ Electrical current ; Temperature ~ Voltage
  - Metal and Si layers composed of elementary blocks
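As a minimal sketch of the electrical analogy, the snippet below derives the thermal resistance and capacitance of one elementary block from its geometry. The function names, the block dimensions, and the silicon properties (rounded textbook values) are illustrative assumptions, not parameters taken from the slides.

```python
# Illustrative only: names, dimensions and material constants are assumptions.
K_SI = 150.0     # W/(m*K), approximate thermal conductivity of silicon near room temperature
CV_SI = 1.63e6   # J/(m^3*K), approximate volumetric heat capacity of silicon


def thermal_resistance(length_m, area_m2, conductivity=K_SI):
    """Conduction resistance R = L / (k * A), in K/W (the analogue of electrical R)."""
    return length_m / (conductivity * area_m2)


def thermal_capacitance(volume_m3, volumetric_heat_capacity=CV_SI):
    """Heat capacity C = c_v * V, in J/K (the analogue of electrical C)."""
    return volumetric_heat_capacity * volume_m3


# Hypothetical elementary block: 1 mm x 1 mm footprint, 0.5 mm thick.
side, thickness = 1e-3, 0.5e-3
r_vertical = thermal_resistance(thickness, side * side)
c_block = thermal_capacitance(side * side * thickness)
tau = r_vertical * c_block  # first-order RC time constant, in seconds

print(f"R = {r_vertical:.3f} K/W, C = {c_block:.3e} J/K, tau = {tau * 1e3:.3f} ms")
```

With heat flow playing the role of current and temperature the role of voltage, a power source injected into such a block raises its temperature in the same way that a current source charges an RC node.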












Note: Si thermal conductivity is dependent on temperature.





# Discrete RC-Thermal Estimation Tool for tiers of 3D Chips

$$C \dot{t}_{k} = -G (t_{k})t_{k} + p_{k} ; k = 1..m$$

- Creating linear approximation while retaining variable Si thermal conductivity:
  - Si thermal conductivity linearly approx. :  $G_{i,i}(t_k) = I + q t_k$
  - Numerically integrating $t_k$ in the discrete time domain:

$$t_{k+1} = A(t_k)t_k + Bp_k ; k = 1..m$$

$$A(t_k) = I - d_t C^{-1}G(t_k) ; \quad B = d_t C^{-1}$$

Time step chosen small enough for convergence
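The update rule can be sketched in a few lines of plain Python. This is a toy two-cell example under stated assumptions, not the thermal library itself: temperatures are measured relative to ambient, the capacitance matrix is diagonal, and every conductance follows a linear law `g0 + q * temperature` with made-up coefficients `g0` and `q`.

```python
def conductance_matrix(t, g0=1.0, q=0.0):
    """Hypothetical 2-cell G(t): each conductance is g0 + q * (local temperature)."""
    g_amb = [g0 + q * ti for ti in t]        # each cell to ambient
    g_lat = g0 + q * 0.5 * (t[0] + t[1])     # cell to cell
    return [[g_amb[0] + g_lat, -g_lat],
            [-g_lat, g_amb[1] + g_lat]]


def euler_step(t, p, c, dt, g0=1.0, q=0.0):
    """One step of t_{k+1} = (I - dt C^-1 G(t_k)) t_k + dt C^-1 p_k, with diagonal C."""
    g = conductance_matrix(t, g0, q)
    n = len(t)
    return [t[i] + dt / c[i] * (p[i] - sum(g[i][j] * t[j] for j in range(n)))
            for i in range(n)]


def simulate(p, c, dt, steps, g0=1.0, q=0.0):
    """Integrate from ambient (t = 0) for a fixed number of steps."""
    t = [0.0] * len(p)
    for _ in range(steps):
        t = euler_step(t, p, c, dt, g0, q)
    return t


# Constant conductivity (q = 0): settles at the solution of G t = p.
steady = simulate(p=[1.0, 0.0], c=[1.0, 1.0], dt=0.01, steps=5000)
# Conductivity falling with temperature (q < 0): the same power runs hotter.
hot = simulate(p=[1.0, 0.0], c=[1.0, 1.0], dt=0.01, steps=5000, q=-0.1)
print(steady, hot)
```

The explicit scheme is only stable when `dt` is small relative to the fastest thermal time constant, which is why the time step has to be chosen small enough.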






Complexity scales linearly with the number of modeled cells (simulated on Xeon Server)

Thermal library validated against finite element model (IMEC and EPFL)





# **MPSoC Thermal Library Validation**

- Extensible set of layers in MPSoC designs
  - Pre-defined material layers and components: Silicon, copper (10 layers), packaging, interposer, bumps, etc.
- Configurable number of cells and iterations per tier
  - Initially 10 ms thermal interval (1000 iterations/tier)
- Test chips manufactured at EPFL:
  - Three types of layouts












### **Correlation Results: Intra-Tier Heat Transfer**


- Lateral heat flow
  - Tested range: 0.5W to 10W per heater
  - Similar accuracy results at different tiers
- Measurements/simulations in case ~9W:








Variations of less than 1.5% between 3D chip measurements and RC-based 3D thermal model



same point (Max temp)





# Temperature Management is Power Control under Thermal Constraints

- Power consumption of cores determines thermal behavior
  - Power consumption depends on frequency and voltage
  - Setting frequencies/voltages can control power and temperature
- Optimization problem: frequency/voltage assignment in MPSoCs under thermal constraints
  - Meet processing requirements
  - Respect thermal constraint at all times
  - Minimize power consumption





# Thermal Management: Initial Thoughts

- Static approach: thermal-aware placement to try to even out the worst-case thermal profile [Sapatnekar, Wong et al.]
  - Computationally difficult problem (NP-complete)

Cannot predict all working conditions, and leakage changes dynamically: not useful in real systems.

- Dynamic approach: HW-based dynamic thermal management
  - Clock gating based on time-out [Xie et al., Brooks et al.]
  - DVFS based on thresholds [Chaparro et al., Mukherjee et al.]
  - Heuristics for component shut-down, limited history [Donald et al.]

These techniques target power minimization; they achieve thermal management only as a by-product...

No formalization of the thermal optimization problem!






# Formalization of Thermal Management Problem in MPSoCs

- Control theory problem
  - Optimal frequency assignment module, 2-phase approach:
  - 1) Design-time phase: find optimal sets of frequencies for the cores for different working conditions
  - Tuning knobs: frequencies/voltages of the system





# Pro-Active Based Thermal Control: Phase 1 – Design-Time

- Predictive model of thermal behavior given a set of frequency assignments










### Making Power and Thermal Constraints Convex

- Power constraint adaptation
  - Change the non-affine (quadratic) equality:

 $p_{max} (f_{i,k})^2 / (f_{max})^2 = p_{i,k}; i = 1,\ldots,n, \forall k$ 

to the convex inequality:

 $p_{max} (f_{i,k})^2 / (f_{max})^2 \le p_{i,k}; i = 1,\ldots,n, \forall k$ 

- Thermal constraint adaptation
  - Use worst case thermal conductivity in the range of allowed temperatures, and iterate (if needed) to optimum
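A small numerical check illustrates why the equality is a problem and the inequality is not. The sketch below is illustrative only: the helper names are made up, and the 4 W and 1 GHz constants are the per-core maxima quoted in the case study. It takes two points on the quadratic curve and tests whether their midpoint still satisfies each form of the constraint.

```python
P_MAX = 4.0     # W, maximum power per core (case-study value)
F_MAX = 1.0e9   # Hz, maximum core frequency (case-study value)


def power_lower_bound(f_hz):
    """Left-hand side of the constraint: p_max * f^2 / f_max^2."""
    return P_MAX * f_hz ** 2 / F_MAX ** 2


def on_equality_curve(f_hz, p_w, tol=1e-9):
    """True if (f, p) satisfies the original quadratic equality."""
    return abs(power_lower_bound(f_hz) - p_w) <= tol


def in_relaxed_set(f_hz, p_w, tol=1e-9):
    """True if (f, p) satisfies the relaxed inequality."""
    return power_lower_bound(f_hz) <= p_w + tol


# Two operating points that satisfy the original equality.
a = (0.2e9, power_lower_bound(0.2e9))
b = (1.0e9, power_lower_bound(1.0e9))
# A convex set must contain the midpoint of any two of its points.
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

print(on_equality_curve(*mid))  # False: the equality set is not convex
print(in_relaxed_set(*mid))     # True: the relaxed set contains the midpoint
```

Because the objective minimizes power, the relaxed inequality is typically tight at the optimum, so little is lost by solving the convex version.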





- Solve the convex problem to obtain a table of optimal frequencies for different working conditions, in polynomial time (in the number of processors)

# Pro-Active Based Thermal Control: Phase 2 – Run-Time, Putting It All Together

- Inputs include the current temperature of the cores
- Use the table of frequency assignments, indexed by the actual conditions, at regular run-time intervals
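The run-time phase amounts to a table lookup. The sketch below is a hedged illustration for a hypothetical 2-core system: the bin edges, the normalized demand input, and every frequency in the table are placeholders, not values produced by the design-time optimization.

```python
from bisect import bisect_right

# All thresholds and frequencies below are hypothetical placeholders.
TEMP_EDGES_C = [60.0, 70.0, 80.0]   # temperature bins: <60, 60-70, 70-80, >=80
DEMAND_EDGES = [0.5]                # normalized workload bins: <0.5, >=0.5

# FREQ_TABLE[temp_bin][demand_bin] -> per-core frequencies in MHz.
FREQ_TABLE = [
    [(600, 600), (1000, 1000)],
    [(600, 600), (900, 900)],
    [(500, 500), (700, 800)],
    [(400, 400), (400, 500)],
]


def select_frequencies(core_temps_c, demand):
    """Index the precomputed table by the hottest core and the current demand."""
    temp_bin = bisect_right(TEMP_EDGES_C, max(core_temps_c))
    demand_bin = bisect_right(DEMAND_EDGES, demand)
    return FREQ_TABLE[temp_bin][demand_bin]


print(select_frequencies([55.0, 58.0], 0.9))  # (1000, 1000)
print(select_frequencies([72.0, 65.0], 0.9))  # (700, 800)
```

Since the expensive optimization is done offline, the per-interval cost at run-time is a couple of binary searches and one table read.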





## Case Study: 8-Core Sun MPSoC

- MPSoC Sun Niagara architecture
  - 8 processing cores SPARC T1
- Max. frequency each core: 1 GHz
  - 10 DVFS values, applied every 100ms
- Max. power per core: 4 W
- Execution characteristics of workloads [Sun Microsystems]:
  - Mixes of 10 different benchmarks, from web-accessing to multimedia
  - 60,000 iterations of basic benchmarks, tens of seconds of actual system execution



Sun's Niagara MPSoC



### Thermal Constraints Respected... And Faster overall!



Proposed method achieves better throughput than standard DVFS while satisfying thermal constraints





#### The Big Picture for Thermal-Aware Design: Large-Scale Computing in Datacenters

- Area is expensive, so we try to build denser infrastructures
  - New containers: many more servers each, >10x density

45% energy overhead in cooling: how do we get higher computational densities (with lower cooling costs)?

- Air-cooled datacenters are very inefficient
  - Cooling needs as much energy as IT... and that energy is thrown away
- For a 10 MW datacenter: ~US\$ 4M wasted per year
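The order of magnitude of the yearly figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes that the 45% cooling overhead applies to the 10 MW total and that electricity costs 0.10 US$/kWh; the price and that reading of the overhead are assumptions, not figures from the slides.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cooling_cost_usd(total_power_mw, cooling_fraction, usd_per_kwh):
    """Yearly cost of the electricity spent on cooling."""
    cooling_kw = total_power_mw * 1000.0 * cooling_fraction
    return cooling_kw * HOURS_PER_YEAR * usd_per_kwh


# 10 MW and 45% come from the slide; 0.10 US$/kWh is an assumed price.
cost = annual_cooling_cost_usd(10.0, 0.45, 0.10)
print(f"~US$ {cost / 1e6:.1f}M per year")
```

Under these assumptions the result is about US$ 3.9M per year, in line with the quoted ~US$ 4M.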





## Processing Trends: Single to Multi-Core



[Courtesy: Yuan Xie, ICCAD 2010]




# Why not use the 3<sup>rd</sup> Dimension?




## Run-Time Heat Spreading in 3D MPSoCs: More Complex Cooling Needs!

5-tier 3D stack: 10 heat sources and sensors



[Garcia and Atienza, Microelectronics Journal 2010]




# Zero-Emission Datacenter: Liquid Cooling Technology and Predictive Energy Management

- Datacenters are "intelligent" heaters
  - 30-40% of carbon footprint in Europe using district heating networks
- Direct re-use of heat output
  - 3D MPSoC architectures

Aquasar datacenter server: 80% payback of electricity costs








## NanoTera CMOSAIC Project: Design of 3D MPSoCs with Advanced Cooling

- 3D systems require novel electro-thermal co-design
  - Academic partners: EPFL and ETHZ
  - Industrial: IBM Zürich and T.J. Watson

3D MPSoC datacenter chip: microchannels etched on back side to circulate (controlled) liquid coolant





#### Creating a Fast Thermal Model: Compact RC-Based Stack Model with TSVs





# 3D MPSoC Thermal Library Deployment

- Extensible set of layers in 3D stack
  - Up to 9 tiers and heat spreader
  - Pre-defined layers:
    - Silicon, copper (10 layers), glue, overmold, interposer, bump
- Configurable number of cells and iterations per tier
  - Also 10 ms thermal interval (1000 iterations/tier)
- Multi-tier test chip manufactured at EPFL:












## Correlation Results: Inter-Tier Heat Transfer

- Vertical heat flow (multi-level measurements)
  - Tested range: 0.5W to 10W per heater
  - Variations only in the global temperature trend





- Tier 2 measurements/simulations:



Variations of approximately 5% between 3D chip measurements and RC-based 3D thermal model





# Modeling Liquid Cooling as RC-Network in 3D MPSoC stacks

- Local junction temperature modeled as 4-resistor based compact transient thermal model (4RM-based CTTM)
  - $R_{tot} = R_{cond} + R_{conv} + R_{heat}$
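A minimal sketch of the series sum is given below. The slide names a four-resistor model but lists only these three terms, so only those are shown. The expression used for each term is a standard textbook form, and the geometry, heat-transfer coefficient, and water properties are assumptions chosen purely for illustration.

```python
def r_cond(thickness_m, conductivity_w_mk, area_m2):
    """Conduction through the solid wall: t / (k * A), in K/W."""
    return thickness_m / (conductivity_w_mk * area_m2)


def r_conv(h_w_m2k, wetted_area_m2):
    """Convection from the wall into the coolant: 1 / (h * A), in K/W."""
    return 1.0 / (h_w_m2k * wetted_area_m2)


def r_heat(flow_ml_min, density_kg_m3=998.0, cp_j_kgk=4183.0):
    """Sensible heating of the coolant: 1 / (rho * c_p * V_dot), in K/W."""
    v_dot_m3_s = flow_ml_min * 1e-6 / 60.0
    return 1.0 / (density_kg_m3 * cp_j_kgk * v_dot_m3_s)


def r_total(flow_ml_min):
    """R_tot = R_cond + R_conv + R_heat for one hypothetical cavity."""
    return (r_cond(50e-6, 150.0, 1e-4)   # assumed 50 um of silicon over 1 cm^2
            + r_conv(2.0e4, 1e-4)        # assumed heat-transfer coefficient
            + r_heat(flow_ml_min))


for flow in (16.0, 32.0, 64.0):
    print(f"{flow:5.1f} ml/min -> R_tot = {r_total(flow):.3f} K/W")
```

In this form only the coolant-heating term depends on the flow rate, which shows why a variable flow rate acts as a control knob for the junction temperature.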





# Manufacturing of 5-Tier 3D Chips with Liquid Channels in Multiple Tiers



Adding multi-tier liquid cooling in-/out-lets








Maximum error of 3.5% between measurements and RC-based 3D thermal model with liquid cooling





# Active-Adapt3D: Active cooling management for 3D MPSoCs

- 3D MPSoC temperature control at system-level:
  - **Electrical based**: task scheduling and DVFS (µs to a few ms)
  - **Mechanical based**: run-time varying flow rate (hundreds of ms)
- Fuzzy logic-based controller and thermal-aware scheduler
  - 1. Design-time analysis: extraction of a set of thermal management rules
  - 2. Run-time thermal management: the rules are used in the scheduler and, subsequently, in the fuzzy logic controller, which combines mechanical and electrical methods
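To give a feel for the fuzzy part, here is a deliberately simplified sketch that maps the hottest-core temperature to a normalized flow-rate command. It is not the controller from the slides: it uses a zero-order Sugeno-style weighted average, a made-up three-rule base, and hypothetical membership breakpoints (only the 85 °C hot-spot threshold appears in the presentation), and it leaves out the DVFS output entirely.

```python
def falling(x, a, b):
    """Membership that is 1 below a and falls linearly to 0 at b."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)


def rising(x, a, b):
    """Membership that is 0 below a and rises linearly to 1 at b."""
    return 1.0 - falling(x, a, b)


def triangle(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Rule consequents: normalized flow command (0 = lowest, 1 = highest setting).
FLOW_LEVEL = {"cool": 0.0, "warm": 0.5, "hot": 1.0}


def flow_command(temp_c):
    """Weighted average of the rule outputs, weighted by rule activation."""
    activation = {
        "cool": falling(temp_c, 60.0, 75.0),
        "warm": triangle(temp_c, 60.0, 75.0, 85.0),
        "hot": rising(temp_c, 75.0, 85.0),
    }
    total = sum(activation.values())
    return sum(activation[k] * FLOW_LEVEL[k] for k in activation) / total


for temp in (55.0, 65.0, 75.0, 80.0, 90.0):
    print(temp, round(flow_command(temp), 3))
```

The memberships overlap so that the command changes smoothly with temperature instead of jumping at a single threshold, which is the main appeal of a fuzzy rule base over plain threshold-based control.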





# Active-Adapt3D: Active cooling management for 3D MPSoCs



and task assignment for 3D MPSoCs!





#### Temperature-Aware Load Balancing (TALB) Scheduler for 3D MPSoCs





# Integrated Flow Rate and DVFS Fuzzy Controller for 3D MPSoCs





# Experiments: Active Thermal Management of 3D MPSoCs with Microchannels

- Target 3D systems based on 3D ICs with Sparc-Power cores
  - Power values and workloads from real traces measured in Sun platforms (database queries, web services, etc.)
- Cores and caches in separate layers
  - 3D crossbar as interconnect
- Channels:
  - Width 100µm and height 50µm
  - Three flow rate settings, default at 32ml/min





# Run-time thermal Management for 3D Chips: thermal evaluation

- For hot spot threshold 85°C, thermal violations: 0%
- Energy reduction:
  - **70%** average coolant energy (**max. savings: 77%**)
  - 52% average total system energy (max. savings: 85%)



Promising figures for thermal control in 3D MPSoCs: thermal gradients of less than five degrees per tier



# Conclusions

Aquasar (2010): first chip-level liquid-cooled server

- MPSoCs: interdisciplinary work
  - Fast RC thermal models for 2D/3D MPSoCs with inter-tier variable liquid fluxes (less than 5% error)
  - Layout combining electrical and mechanical constraint modeling
- Next generation of thermal-aware proactive controllers (task control, flow rate and DVFS)
  - Holistic control significantly reduces thermal issues and improves energy cost (80% energy savings)
  - "Green" datacenters: energy efficient
    - Roadrunner: 445 MFlops/Watt
    - Aquasar: 2250 MFlops/Watt



[Figure labels: back side, water conditioning. Courtesy: IBM Zürich]



#### **References and Bibliography**

- 2D and 3D thermal modeling
  - "HW-SW Emulation Framework for Temperature-Aware Design in MPSoCs", D. Atienza, et al., *ACM TODAES*, August 2007.
  - "3D-ICE: Compact transient thermal model for 3D ICs with liquid cooling via enhanced heat transfer cavity geometries", A. Sridhar, et al., *Proc. of ICCAD 2010*, USA, November 2010 (http://esl.epfl.ch/3d-ice.html).
  - "Emulation-based transient thermal modeling of 2D/3D systems-on-chip with active cooling", Pablo G. Del Valle, David Atienza, *Elsevier Microelectronics Journal*, December 2010.
  - "Compact transient thermal model for 3D ICs with liquid cooling via enhanced heat transfer cavity geometries", A. Sridhar, et al., *Proc. of THERMINIC 2010*, Spain, October 2010.
  - "Fast Thermal Simulation of 2D/3D Integrated Circuits Exploiting Neural Networks and GPUs", A. Vincenzi, et al., *Proc. of ISLPED 2011*, Japan, August 2011.



- Thermal management for 2D MPSoCs
  - "Thermal Balancing Policy for Multiprocessor Stream Computing Platforms", F. Mulas, et al., *IEEE T-CAD*, December 2009.
  - "Processor Speed Control with Thermal Constraints", A. Mutapcic, et al., *IEEE TCAS-I*, September 2009.
  - "Online Convex Optimization-Based Algorithm for Thermal Management of MPSoCs", F. Zanini, et al., *Proc. of GLSVLSI 2010*, USA, May 2010.
  - "Temperature Control of High-Performance Multi-core Platforms Using Convex Optimization", S. Murali, et al., *Proc. of DATE 2008*, Germany, March 2008.
  - "A Control Theory Approach for Thermal Balancing of MPSoC", F. Zanini, et al., *Proc. of ASP-DAC 2009*, Japan, January 2009.
  - "Multicore Thermal Management with Model Predictive Control", F. Zanini, et al., *Proc. of ECCTD 2009*, Turkey, August 2009.


- Thermal management for 3D MPSoCs
  - "3D Stacked Systems with Active Cooling: The Road to Single-Chip High-Performance Computing", Ayse K. Coskun, et al., *IEEE MICRO*, August/September 2011.
  - "Hierarchical Thermal Management Policy for High-Performance 3D Systems with Liquid Cooling", F. Zanini, et al., *IEEE JETCAS*, September 2011.
  - "Fuzzy Control for Enforcing Energy Efficiency in High-Performance 3D Systems", M. Sabry, et al., *Proc. of ICCAD 2010*, USA, November 2010.
  - "Energy-Efficient Variable-Flow Liquid Cooling in 3D Stacked Architectures", Ayse K. Coskun, et al., *Proc. of DATE 2010*, Germany, March 2010.
  - "Modeling and Dynamic Management of 3D Multicore Systems with Liquid Cooling", Ayse K. Coskun, et al., *Proc. of VLSI-SoC 2009*, Brazil, October 2009 (Best Paper Award).
  - "Dynamic Thermal Management in 3D Multicore Architectures", Ayse K. Coskun, et al., *Proc. of DATE 2009*, France, April 2009.





# QUESTIONS ?



Nano-Tera.ch Swiss Engineering Programme



Swiss National Science Foundation



European Commission

Acknowledgements: LSM and LTCM-EPFL, ECE-Boston University







**IBM Zürich**