# Bounding DRAM Interference in COTS Heterogeneous MPSoCs for Mixed Criticality Systems

**Mohamed Hassan** and Rodolfo Pellizzoni




### Mixed Criticality Systems

MOTIVATION

- Emerging systems no longer solely host isolated safety-critical tasks
  - Execute tasks with different criticalities
  - Criticality $\propto$ consequences of failure to meet requirements



### High-criticality tasks

- Airbag Control Unit (ACU)
- Anti-lock Braking System (ABS)
- Engine Control Unit (ECU)

### Low-criticality tasks

- Air Conditioning Unit
- Connectivity Box
- Infotainment Unit


### Why MPSoCs?

- Low cost
- High performance
- Energy Efficiency
- Low time-to-market (3<sup>rd</sup> party IPs)


### Why Heterogeneous MPSoCs?

- Variety of processing capabilities
  - Best suits the conflicting requirements of MCS



### Heterogeneous MPSoCs with Real-time Processors

### Background



- DRAM consists of multiple banks
- The memory controller (MC) manages accesses to DRAM
- A request in general consists of:
  - ACTIVATE command:
    - Brings a data row from the cells into the sense amplifiers
  - RD/WR commands:
    - Read/write specific columns in the sense amplifiers
  - PRECHARGE command:
    - Writes back the previous row in the sense amplifiers before bringing in the new one
- All commands have associated timing constraints that the controller must satisfy
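The command breakdown above can be sketched as a small function that picks the commands one request needs from the state of its target bank. This is a minimal illustration, assuming an open-row policy in which a row stays in the sense amplifiers until another row is requested; the function name and the use of `None` for an idle bank are hypothetical and not taken from the slides. Timing constraints are deliberately left out.

```python
from typing import List, Optional


def commands_for_request(open_row: Optional[int], target_row: int,
                         is_write: bool) -> List[str]:
    """Return the DRAM commands one request needs on its target bank.

    open_row is the row currently held in the bank's sense amplifiers,
    or None if no row is open (assumed open-row policy).
    """
    access = "WR" if is_write else "RD"
    if open_row == target_row:
        # Row already in the sense amplifiers: only the column access.
        return [access]
    if open_row is None:
        # No row open: bring the row in, then access it.
        return ["ACTIVATE", access]
    # A different row is open: write it back first, then bring the new one.
    return ["PRECHARGE", "ACTIVATE", access]


print(commands_for_request(2, 5, False))
```

A request that targets a bank holding a different row needs all three command types, which is why the state other requests leave behind matters for the delay of the request under analysis.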






### System Overview

MODEL

#### P processing elements

- P<sub>cr</sub> critical + P<sub>ncr</sub> non-critical
- LLC is write-back write-allocate
  - Writes to DRAM are only cache evictions
- Single-channel single-rank DRAM subsystem
- N<sub>B</sub> DRAM banks




#### Goal:

Derive an upper bound on the delay incurred by any memory request of a critical PE

### Platform Instances

| Level    | Parameter | Values                     |
|----------|-----------|----------------------------|
| OS       | part      | Part-All, No-Part, Part-Cr |
| OS       | thr       | 0, 1                       |
| OS       | pr        | 0, 1                       |
| HW setup | wb        | 0, 1                       |
| HW setup | breorder  | 0, 1                       |
| HW setup | pipe      | OOO, IO-Cr, IO-All         |

Platform instances: 12 OS rows × 12 HW columns = 144

**Observation 1:** If the MC reorders across all commands (breorder=1) and no write batching is deployed (wb=0) → unbounded WCD.
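The instance count can be reproduced by enumerating the parameter grid. This is a minimal sketch, assuming the value sets read off the table header; the Python names are illustrative and do not come from the slides.

```python
from itertools import product

# Value sets as they appear in the table header.
PART = ("Part-All", "No-Part", "Part-Cr")
THR = PR = WB = BREORDER = (0, 1)
PIPE = ("OOO", "IO-Cr", "IO-All")

# One tuple per platform instance: (part, thr, pr, wb, breorder, pipe).
instances = list(product(PART, THR, PR, WB, BREORDER, PIPE))
print(len(instances))  # 144
```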







The table now marks the wb=0, breorder=1 columns as UNBOUNDED for every OS setup, and merges the two wb=1 column groups into a single group, wb=1, breorder=x.

Platform instances remaining: 72
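The drop from 144 to 72 can be checked by mapping each instance to a canonical form. This is a sketch under two assumptions read off the table: every instance with wb=0 and breorder=1 is discarded as unbounded, and breorder becomes a don't-care (`"x"`) whenever wb=1. The function name is hypothetical.

```python
from itertools import product


def reduce_instance(cfg):
    """Return None for an unbounded instance, else its canonical form."""
    part, thr, pr, wb, breorder, pipe = cfg
    if wb == 0 and breorder == 1:
        return None  # unbounded WCD, dropped from further analysis
    if wb == 1:
        breorder = "x"  # merged column group wb=1, breorder=x
    return (part, thr, pr, wb, breorder, pipe)


space = product(("Part-All", "No-Part", "Part-Cr"),
                (0, 1), (0, 1), (0, 1), (0, 1),
                ("OOO", "IO-Cr", "IO-All"))
remaining = {key for key in map(reduce_instance, space) if key is not None}
print(len(remaining))  # 72
```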

**Observation:** If thr=0 & (No-Part | (Part-Cr & pr=0)) → unbounded WCD.

METHODOLOGY

The table now marks as UNBOUNDED, in both the wb=0, breorder=0 and the wb=1, breorder=x columns, every No-Part row with thr=0 and the Part-Cr row with thr=0, pr=0. The wb=0, breorder=1 columns stay UNBOUNDED.

Platform instances remaining: 54
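The step from 72 to 54 follows from counting the OS rows that the thr=0 condition marks as unbounded. This is a minimal sketch, assuming six HW columns are still open per row at this point (three pipelines under wb=0, breorder=0 and three under wb=1, breorder=x); the helper name is illustrative.

```python
from itertools import product


def is_unbounded(part, thr, pr):
    """thr=0 & (No-Part | (Part-Cr & pr=0)) -> unbounded WCD."""
    return thr == 0 and (part == "No-Part" or (part == "Part-Cr" and pr == 0))


os_rows = list(product(("Part-All", "No-Part", "Part-Cr"), (0, 1), (0, 1)))
pruned_rows = [row for row in os_rows if is_unbounded(*row)]

HW_COLUMNS = 6  # assumption: columns not already marked UNBOUNDED
remaining = 72 - len(pruned_rows) * HW_COLUMNS
print(len(pruned_rows), remaining)  # 3 54
```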

**Observation 4:** Part-All effect

If Part-All → r<sub>ua</sub> does not suffer intra-bank reordering or conflict interferences:

- thr=x
- If wb=0 → pipe=x

Part-All rows, with thr and the wb=0 pipeline as don't-cares (x):

| part     | thr | pr | wb=0, breorder=0 (pipe=x) | wb=1, breorder=x, OOO | wb=1, breorder=x, IO-Cr | wb=1, breorder=x, IO-All |
|----------|-----|----|---------------------------|-----------------------|-------------------------|--------------------------|
| Part-All | x   | 0  | confg1                    | confg11               | confg12                 | confg13                  |
| Part-All | x   | 1  | confg2                    | confg14               | confg15                 | confg16                  |

No-Part with thr=0 and Part-Cr with thr=0, pr=0 remain UNBOUNDED, as do the wb=0, breorder=1 columns.
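The collapse of the Part-All rows can be checked by grouping their cells into equivalence classes. This is a sketch assuming the two don't-care rules for Part-All (thr=x always, pipe=x when wb=0) and a starting count of 54 instances; here `wb=0` stands for the wb=0, breorder=0 columns and `wb=1` for wb=1, breorder=x. The names are illustrative.

```python
from itertools import product

PIPES = ("OOO", "IO-Cr", "IO-All")


def canonical_part_all(thr, pr, wb, pipe):
    """Canonical form of a Part-All cell under the two don't-care rules."""
    thr = "x"       # thr has no effect under Part-All
    if wb == 0:
        pipe = "x"  # pipeline has no effect when wb=0
    return (thr, pr, wb, pipe)


# All Part-All cells still open: (thr, pr, wb, pipe).
cells = list(product((0, 1), (0, 1), (0, 1), PIPES))
classes = {canonical_part_all(*cell) for cell in cells}

remaining = 54 - len(cells) + len(classes)
print(len(cells), len(classes), remaining)  # 24 8 38
```

The eight classes correspond to the eight Part-All labels in the table: confg1, confg2, and confg11 through confg16.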



Platform instances remaining: 38

**Observation 5:** Part-Cr effect when wb=0

If Part-Cr & wb=0 → r<sub>ua</sub> does not suffer intra-bank reordering nor conflict interferences from critical PEs:

- IO-Cr and OOO-All (column OOO) have the same effect on WCD

Under wb=0, breorder=0, the table now assigns configurations to the Part-Cr rows: confg8 for thr=0, pr=1; confg9 for thr=1, pr=0; and confg10 for thr=1, pr=1. All other entries are unchanged.
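The effect of merging OOO and IO-Cr on the Part-Cr rows can be counted the same way. This is a sketch assuming 38 instances before the merge and that the Part-Cr row with thr=0, pr=0 is already UNBOUNDED; `wb=0` stands for the wb=0, breorder=0 columns and `wb=1` for wb=1, breorder=x. The function name is hypothetical.

```python
from itertools import product

PIPES = ("OOO", "IO-Cr", "IO-All")


def canonical_part_cr(thr, pr, wb, pipe):
    """Merge the OOO and IO-Cr pipelines for Part-Cr when wb=0."""
    if wb == 0 and pipe == "IO-Cr":
        pipe = "OOO"  # representative of the merged {OOO, IO-Cr} pair
    return (thr, pr, wb, pipe)


# Part-Cr rows still open: every (thr, pr) except the unbounded (0, 0).
rows = [(thr, pr) for thr, pr in product((0, 1), repeat=2) if (thr, pr) != (0, 0)]
cells = [(thr, pr, wb, pipe) for thr, pr in rows for wb in (0, 1) for pipe in PIPES]
classes = {canonical_part_cr(*cell) for cell in cells}

remaining = 38 - len(cells) + len(classes)
print(len(cells), len(classes), remaining)  # 18 15 35
```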



Platform instances remaining: 35

**Observation 6:** Priority effect

If pr=1 & wb=0 → pipeline architecture of non-critical PEs has no effect on WCD:

- IO-Cr and IO-All have the same effect on WCD

| part     | thr | pr | wb=0, breorder=0 | wb=0, breorder=1 | wb=1, breorder=x (OOO / IO-Cr / IO-All) |
|----------|-----|----|------------------|------------------|-----------------------------------------|
| Part-All | 0   | 0  | confg1           | UNBOUNDED        | confg11 / confg12 / confg13             |
| Part-All | 0   | 1  | confg2           | UNBOUNDED        | confg14 / confg15 / confg16             |
| Part-All | 1   | 0  | confg1           | UNBOUNDED        | confg11 / confg12 / confg13             |
| Part-All | 1   | 1  | confg2           | UNBOUNDED        | confg14 / confg15 / confg16             |
| No-Part  | 0   | 0  | UNBOUNDED        | UNBOUNDED        | UNBOUNDED                               |
| No-Part  | 0   | 1  | UNBOUNDED        | UNBOUNDED        | UNBOUNDED                               |
| No-Part  | 1   | 0  |                  | UNBOUNDED        |                                         |
| No-Part  | 1   | 1  | confg7           | UNBOUNDED        |                                         |
| Part-Cr  | 0   | 0  | UNBOUNDED        | UNBOUNDED        | UNBOUNDED                               |
| Part-Cr  | 0   | 1  | confg8           | UNBOUNDED        |                                         |
| Part-Cr  | 1   | 0  | confg9           | UNBOUNDED        |                                         |
| Part-Cr  | 1   | 1  | confg10          | UNBOUNDED        |                                         |



METHODOLOGY


- Priority with Part-Cr effect:
  - thr=x
  - If wb=0 → pipe=x
  - Same as Part-All effect!!


| OS       | HW setup |    |                  |           |           |                  |           |           |                  |           |           |
|----------|-----|----|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| part     | thr | pr | wb=0, breorder=0 |           |           | wb=0, breorder=1 |           |           | wb=1, breorder=x |           |           |
|          |     |    | OOO       | IO-Cr     | IO-All    | OOO       | IO-Cr     | IO-All    | OOO       | IO-Cr     | IO-All    |
| Part-All | 0   | 0  | confg1    | confg1    | confg1    |           |           |           | confg11   | confg12   | confg13   |
| Part-All | 0   | 1  | confg2    | confg2    | confg2    |           |           |           | confg14   | confg15   | confg16   |
| Part-All | 1   | 0  | confg1    | confg1    | confg1    |           |           |           | confg11   | confg12   | confg13   |
| Part-All | 1   | 1  | confg2    | confg2    | confg2    |           |           |           | confg14   | confg15   | confg16   |
| No-Part  | 0   | 0  | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED |
| No-Part  | 0   | 1  | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED | UNBOUNDED |
| No-Part  | 1   | 0  | confg3    | confg4    | confg5    | UNBOUNDED | UNBOUNDED | UNBOUNDED | confg17   | confg18   | confg19   |
| No-Part  | 1   | 1  | confg6    | confg7    | confg7    | UNBOUNDED | UNBOUNDED | UNBOUNDED | confg20   | confg21   | confg22   |
| Part-Cr  | 0   | 0  | UNBOUNDED | UNBOUNDED | UNBOUNDED |           |           |           | UNBOUNDED | UNBOUNDED | UNBOUNDED |
| Part-Cr  | 0   | 1  | confg8    | confg8    | confg8    |           |           |           | confg23   | confg24   | confg25   |
| Part-Cr  | 1   | 0  | confg9    | confg10   | confg10   |           |           |           | confg26   | confg27   | confg28   |
| Part-Cr  | 1   | 1  | confg8    | confg8    | confg8    |           |           |           | confg23   | confg24   | confg25   |

144 Instances  $\rightarrow$  28 Configurations
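
The instance count can be reproduced with a short sketch. It assumes that the 144 instances are the full cross product of the table axes (three partitioning schemes, thr, pr, wb, breorder, and three PE-type mixes); that decomposition is inferred from the table layout, not stated on the slide, and the variable names are illustrative.

```python
from itertools import product

# Axis values as they appear in the configuration table; treating the design
# space as their full cross product is an assumption of this sketch.
PARTITIONING = ("Part-All", "No-Part", "Part-Cr")  # OS-level bank partitioning
THR = (0, 1)        # FR-FCFS threshold disabled / enabled
PR = (0, 1)         # fixed priority for critical PEs off / on
WB = (0, 1)         # write batching off / on
BREORDER = (0, 1)   # inter-bank reordering setting
PE_TYPES = ("OOO", "IO-Cr", "IO-All")

instances = list(product(PARTITIONING, THR, PR, WB, BREORDER, PE_TYPES))

# With wb=1 the table marks breorder as "x" (don't care), so both breorder
# values of such an instance fall into the same table cell.
cells = {
    (part, thr, pr, wb, breorder if wb == 0 else "x", pe)
    for part, thr, pr, wb, breorder, pe in instances
}

print(len(instances))  # 144
print(len(cells))      # 108
```

Under this reading, the 144 instances occupy 108 table cells, which the analysis then groups into 28 distinct configurations plus the unbounded cases.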

# **General Observations**




- Consider all timing constraints generated by the commands of interfering requests from other PEs that are serviced between the arrival and the completion of r<sub>ua</sub>
- Add the delays due to command bus contention
- Compute the WCD for each configuration separately?
  - Still too much
  - Not general enough




### Memory Delay Building Blocks

[Figure: timeline from the arrival of r<sub>ua</sub> to its completion. The WCD in between is annotated with the building blocks N<sup>InterB</sup>, L<sup>InterB</sup>, N<sup>WB</sup>, L<sup>WB</sup>, N<sup>Conf</sup>, N<sup>Conf</sup> × L<sup>InterB</sup>, and N<sup>Reorder</sup>.]

- We classify interfering requests (aka delay sources) into four types → <u>causing four basic interferences:</u>
  - 1. Inter-bank interference (requests to other banks)
  - 2. Write batch interference (only for R/W reordering)
  - 3. Conflict interference (requests to the same bank but different rows that arrived before r<sub>ua</sub>)
  - 4. Intra-bank reorder interference (FR-FCFS)

$$
\begin{aligned}
WCD ={} & L^{InterB}(N^{InterB}, wb) + wb \times L^{WB}(N^{WB}) \\
& + L^{Conf}(N^{Conf}) + N^{Conf} \times L^{InterB}(N^{InterB}, wb) \\
& + L^{Reorder}(N^{Reorder}, wb) + N^{Reorder} \times L^{InterB}_{CAS}(N^{InterB}, wb)
\end{aligned}
$$
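
The structure of the WCD expression can be sketched as a function that sums its six terms. The latency functions are passed in as parameters because only $L^{Conf}$ is spelled out on the slides; the `toy_*` functions below are hypothetical placeholders with made-up per-request costs, used only to show how the terms combine.

```python
def wcd(n_interb, n_wb, n_conf, n_reorder, wb,
        l_interb, l_wb, l_conf, l_reorder, l_interb_cas):
    """Sum the six terms of the WCD expression (all values in cycles)."""
    return (
        l_interb(n_interb, wb)                     # inter-bank delay of r_ua itself
        + wb * l_wb(n_wb)                          # write batch, only if wb == 1
        + l_conf(n_conf)                           # row conflicts
        + n_conf * l_interb(n_interb, wb)          # inter-bank delay per conflict
        + l_reorder(n_reorder, wb)                 # FR-FCFS reordering
        + n_reorder * l_interb_cas(n_interb, wb)   # inter-bank CAS delay per reorder
    )


# Hypothetical placeholder latencies (NOT the expressions of the analysis):
# each one charges a fixed, made-up number of cycles per interfering request.
def toy_l_interb(n, wb):
    return 4 * n

def toy_l_wb(n):
    return 10 * n

def toy_l_conf(n):
    return 39 * n

def toy_l_reorder(n, wb):
    return 4 * n

def toy_l_interb_cas(n, wb):
    return 4 * n


example = wcd(n_interb=7, n_wb=0, n_conf=12, n_reorder=8, wb=0,
              l_interb=toy_l_interb, l_wb=toy_l_wb, l_conf=toy_l_conf,
              l_reorder=toy_l_reorder, l_interb_cas=toy_l_interb_cas)
print(example)  # 1088
```

Keeping the *N*s and the *L*s as separate inputs mirrors the split used in the analysis: the counts change with the configuration, while the latency functions do not.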


### Memory Delay Building Blocks

- Assume we know the numbers of interfering requests (the *N*s). How do we compute the latency components (the *L*s)?
  - → The *L*s depend only on the *N*s and on the known JEDEC timing constraints
  - → The *L*s are configuration-independent DRAM delay components
  - → $L^{Conf}$ as an example:

$$L^{Conf}(N^{Conf}) = N^{Conf} \times (\max(tRAS,\ tRCD + tWL + tB + tWR) + tRP)$$
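
As a worked instance of the $L^{Conf}$ formula, the sketch below evaluates it in clock cycles. The timing values are assumptions chosen to resemble a DDR3-1333H device and are not taken from the slides; the `Timing` class and the constant name are illustrative, and the values should be checked against the device datasheet before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """JEDEC timing constraints, expressed in clock cycles."""
    tRAS: int
    tRCD: int
    tWL: int
    tB: int
    tWR: int
    tRP: int


def l_conf(n_conf: int, t: Timing) -> int:
    """L^Conf = N^Conf * (max(tRAS, tRCD + tWL + tB + tWR) + tRP)."""
    per_conflict = max(t.tRAS, t.tRCD + t.tWL + t.tB + t.tWR) + t.tRP
    return n_conf * per_conflict


# Assumed, illustrative values (cycles); not taken from the slides.
ASSUMED_DDR3_1333H = Timing(tRAS=24, tRCD=9, tWL=7, tB=4, tWR=10, tRP=9)

print(l_conf(1, ASSUMED_DDR3_1333H))   # 39
print(l_conf(12, ASSUMED_DDR3_1333H))  # 468
```

With these assumed values the write path (tRCD + tWL + tB + tWR = 30) dominates tRAS (24), so each conflicting request costs 39 cycles.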




# Number of Interfering Requests


- Now it only remains to compute the *N*s, which are configuration-dependent.
- Take confg3 as an example:
- 1. Conflicts ($N^{Conf}$):
  - OOO-All → each PE has $PR$ pending requests
  - No FP → critical and non-critical requests are scheduled alike
  - Then $N^{Conf} = (P - 1) \times PR$ requests can conflict with $r_{ua}$
- 2. Reorder ($N^{Reorder}$):
  - FR-FCFS thr → at most $N^{Reorder} = N_{thr}$ requests can be reordered before $r_{ua}$
- 3. Inter-bank ($N^{InterB}$):
  - RR arbiter and no FP → at most $N^{InterB} = N_B - 1$ requests from other banks can be serviced before $r_{ua}$
- 4. Write batch ($N^{WB}$):
  - No WB → $N^{WB} = 0$
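
The four counts for confg3 can be collected in a small helper. The function and key names are illustrative; the example inputs (four PEs, PR = 4, N<sub>thr</sub> = 8, 8 banks) are the values listed in the evaluation setup.

```python
def confg3_counts(num_pes: int, pending_per_pe: int, n_thr: int, num_banks: int) -> dict:
    """Numbers of interfering requests for confg3 (No-Part, thr, no FP, no WB, OOO-All)."""
    return {
        "conf": (num_pes - 1) * pending_per_pe,  # every other PE has PR pending requests
        "reorder": n_thr,                        # capped by the FR-FCFS threshold
        "interb": num_banks - 1,                 # one request per other bank under RR
        "wb": 0,                                 # write batching is disabled
    }


counts = confg3_counts(num_pes=4, pending_per_pe=4, n_thr=8, num_banks=8)
print(counts)  # {'conf': 12, 'reorder': 8, 'interb': 7, 'wb': 0}
```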


#### confg3:

- no write batching (wb=0)
- FR-FCFS with a threshold (thr=1)
- no fixed priority (pr=0)
- inter-bank reordering among different request types only (breorder=0)
- all PEs are OOO
- no partitioning (No-Part)

- **Follow the same approach for all configurations**

| Configuration | $N^{Conf}$                      | $N^{Reorder}$ | $N^{InterB}$ |
|---------------|---------------------------------|---------------|--------------|
| $confg_1$     | 0                               | 0             | $N_B - 1$    |
| $confg_2$     | 0                               | 0             | $N_{Bcr}$    |
| $confg_3$     | $(P-1) \cdot PR$                | $N_{thr}$     | $N_B - 1$    |
| $confg_4$     | $P_{ncr} \cdot PR + P_{cr} - 1$ | $N_{thr}$     | $N_B - 1$    |
| $confg_5$     | $P - 1$                         | $N_{thr}$     | $N_B - 1$    |
| $confg_6$     | $(P_{cr}-1) \cdot PR + 1$       | $N_{thr}$     | $N_B - 1$    |
| $confg_7$     | $P_{cr}$                        | $N_{thr}$     | $N_B - 1$    |
| $confg_8$     |                                 | 0             | $N_B - 1$    |
| $confg_9$     | $P_{ncr} \cdot PR$              | $N_{thr}$     | $N_B - 1$    |
| $confg_{10}$  | $P_{ncr}$                       | $N_{thr}$     | $N_B - 1$    |

- No WB → $N^{WB} = 0$
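
To see how the conflict counts react to the processor mix, the sketch below evaluates the $N^{Conf}$ expressions listed for confg3 through confg6. It assumes $P = P_{cr} + P_{ncr}$, and the function name is illustrative.

```python
def n_conf(p_cr: int, p_ncr: int, pr: int) -> dict:
    """N^Conf for confg3..confg6, assuming P = P_cr + P_ncr."""
    p = p_cr + p_ncr
    return {
        "confg3": (p - 1) * pr,            # all PEs OOO, no priority
        "confg4": p_ncr * pr + p_cr - 1,   # critical PEs in-order, non-critical OOO
        "confg5": p - 1,                   # all PEs in-order
        "confg6": (p_cr - 1) * pr + 1,     # OOO with priority for critical PEs
    }


baseline = n_conf(p_cr=2, p_ncr=2, pr=4)
more_ncr = n_conf(p_cr=2, p_ncr=4, pr=4)
print(baseline)  # {'confg3': 12, 'confg4': 9, 'confg5': 3, 'confg6': 5}
print(more_ncr)  # {'confg3': 20, 'confg4': 17, 'confg5': 5, 'confg6': 5}
```

Adding non-critical PEs raises the count for confg3, confg4, and confg5 but leaves confg6 unchanged, since its expression depends only on $P_{cr}$.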

# **Evaluation Setup**

| Component  | Setup |
|------------|-------|
| PEs        | A private 16KB L1 and a shared 1MB L2 cache<br>An in-order PE has at most one pending request to the DRAM<br>An OOO PE has at most 4 pending requests to the DRAM (PR = 4)<br>Four-processor system unless otherwise specified |
| OS Mapping | Through the virtual-to-physical address mapping component at MacSim's frontend<br>Based on the configuration, we enable the corresponding partitioning (Part-All, Part-Cr, or No-Part) |
| DRAM       | DDR3-1333H with single channel, single rank, and 8 banks |
| MC         | Per-bank queues with RR among banks and FR-FCFS arbitration within each bank<br>Based on the configuration:<br>• critical PEs can be assigned higher priority than non-critical PEs<br>• the threshold for FR-FCFS is enabled or disabled (when enabled, N<sub>thr</sub> = 8 unless otherwise specified)<br>• write batching is enabled or disabled |
| Benchmarks | EEMBC Automotive:<br>• The two critical PEs execute a2time and rspeed<br>• The two non-critical PEs execute matrix and aifftr<br>Synthetic:<br>• Each of the critical PEs executes one instance of the Latency benchmark<br>• Each of the non-critical PEs executes one instance of the Bandwidth benchmark |





# WCD of Critical Processors

#### RESULTS












Compared to Confg 6 (No-Part):

- Confg 2 (Part-All):
  - 96% less WCD
  - 60% BW degradation
- Confg 8 (Part-Cr + FP):
  - 89% less WCD
  - 0.85% BW degradation

[Figure: bar chart grouped by partitioning (partAll, noPart, partCr), threshold (noThr, thr), priority (noPr, pr), and PE types (IO-All, IO-Cr, OOO-All), with confg<sub>2</sub>, confg<sub>6</sub>, and confg<sub>8</sub> highlighted.]

# Bandwidth




- Results are normalized to WB-expr
- WB-analytical is very pessimistic
- WB improves the average case
  - noWb-expr is 2.84x Wb-expr on average
  - and reaches up to 10x



## Write Batching Effect


## Sensitivity to # Processors

- Confg 1, 2, and 8 offer complete isolation
  - Confg 1 and 2: Part-All
  - Confg 8: Part-Cr with fixed priority
- Confg 6 and 7 offer isolation from non-critical PEs
- Confg 9 and 10 offer isolation from critical PEs
- Confg 3, 4, and 9 are more vulnerable: their WCD increases as $P_{ncr}$ increases
- Confg 3 and 6 are more vulnerable: their WCD increases as $P_{cr}$ increases



## Sensitivity to FR-FCFS thr.

- Confg 1, 2, and 8 offer complete isolation from FR-FCFS reordering, so they are insensitive to the threshold
  - Confg 1 and 2: Part-All
  - Confg 8: Part-Cr with fixed priority
- Confg 3-7 and 10 scale linearly with the FR-FCFS threshold
  - The slope is the same for these configurations
  - The $L^{Reorder}$ component depends only on thr and the JEDEC constraints
- Reordering has a huge impact on WCD

## Summary & Conclusions

- Heterogeneous MPSoCs are important for Mixed Criticality Systems
- We derived a generalized analysis that bounds the per-request DRAM interference delay in MPSoCs
  - 144 different platform instances, covered by 28 configurations

[Figure: overview of the analysis. MC policies: R/W reorder (1: write batching, 0: no write batching), FR-FCFS threshold (1: FR-FCFS is capped, 0: no cap on FR-FCFS), and priority (1: critical PEs are prioritized, 0: no priority). Interference sources include row conflicts ($N^{Conf}$: requests that arrived before the one under analysis and target different rows) and inter-bank interference (requests that target different banks and are serviced before the one under analysis because of the RR policy).]

#### Main lessons:

- 1. DRAM's WCD significantly depends on MPSoC features
- 2. We identified features that lead to unbounded WCD
- 3. Leveraging existing features such as PE prioritization can allow the designer to better trade off the maximum delay for critical applications against the bandwidth for non-critical ones
- 4. There is interdependency among the effects of the features on both the delay and the bandwidth. The existence of some features can countermand the effect of others
- 5. Although the write batching mechanism works well in the average case, it induces pathological cases that result in high bounds on the per-request delay