AI/ML: Use ML techniques for layer stackup modeling

Note:

  • The dataset and IPython notebook for this post are available at SPISim’s GitHub page: [HERE]
  • This notebook may be rendered better by nbviewer and can be viewed [HERE].

Motivation:

When planning a PCB layer stackup, we often would like to know the trade-offs between various layout options and their signal integrity performance. For example, a wider trace provides smaller impedance yet occupies more routing area. Narrowing the spacing between differential pairs may save some space but will also increase crosstalk. Most EDA tools for system-level signal integrity analysis provide a "transmission line calculator", like the one shown below, for designers to quickly make estimates and determine the trade-offs:
TLineCalc

However, all such "calculators" I have seen, even commercial ones, only consider the single trace or one differential pair itself. They do not take crosstalk into account. Moreover, stackup parameters such as conductivity and permittivity must be entered individually instead of as a range. As a result, a user cannot easily visualize the relationships between performance metrics and the stackup properties. Thus, an enhanced version of such a "T-Line calculator", which addresses the aforementioned gaps, would be very useful. Such a tool requires a prediction model linking the various stackup parameters to their performance targets. Data science/machine learning techniques can be used to build such a model.

Problem Statements:

We would like to build a prediction model such that, given a set of stackup parameters such as trace width and spacing, performance metrics such as impedance, attenuation, and near-end and far-end crosstalk can be quickly estimated. This model can then be deployed in a stand-alone tool for range-based sweeps, so that visual plots can be generated showing the relations between the various parameters when deciding design trade-offs.

Generate Data:

Overview:

The model to be built here is for numerical (i.e. regression) prediction with around 10 attributes, i.e. input variables. Various stackup configurations will be generated via sampling, and their corresponding stackup models, in the form of frequency-dependent R/L/G/C matrices, will be simulated via a field solver. This process is deterministic. Post-processing steps will then read these solved models and calculate performance. Here we define the performance metrics to be predicted as impedance, attenuation, near-end/far-end crosstalk and propagation speed.

Define stackup structure:

There are many possible stackup structures, as shown below. For more accurate prediction, we are going to generate one prediction model per structure. Presets

Using three single-ended traces (victim in the middle) in a strip-line setup as an example, the various attributes may be defined as shown below: Setup

These parameters, such as S (spacing), W (width), Sigma (conductivity), Er (permittivity) and H (height), are represented as variables to be sampled.

Define sampling points:

The next step is to define the ranges of variable values and the sampling points. Since there are about 10 parameters, a full combinatorial sweep would be impractical. Thus we may need to apply sampling algorithms such as design-of-experiments or space-filling methods to establish the best coverage of the solution space. For this setup, we have generated 10,000 cases to be simulated. Sample
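As a small illustration of the space-filling idea, a Latin-hypercube style sampler can be sketched in a few lines of numpy. The parameter names and ranges below are placeholders for illustration, not the exact ones used for the 10,000-case run:

```python
import numpy as np

def latin_hypercube(ranges, n_samples, seed=0):
    """Space-filling sample: each variable's range is split into
    n_samples strata and every stratum is hit exactly once."""
    rng = np.random.default_rng(seed)
    n_vars = len(ranges)
    # one random point inside each stratum, per variable
    u = (rng.random((n_samples, n_vars)) + np.arange(n_samples)[:, None]) / n_samples
    # shuffle the strata independently per column to decouple the variables
    for j in range(n_vars):
        rng.shuffle(u[:, j])
    lows = np.array([lo for lo, _ in ranges])
    highs = np.array([hi for _, hi in ranges])
    return lows + u * (highs - lows)

# hypothetical subset of the stackup variables and their ranges
ranges = [(1.4, 7.0),    # H1: dielectric height 1
          (2.0, 13.0),   # H3: dielectric height 3
          (3.0, 4.7),    # ER1: permittivity
          (2.0, 9.0),    # W: trace width
          (2.0, 35.0)]   # S: trace spacing
samples = latin_hypercube(ranges, n_samples=10000)
```

Each row of `samples` then becomes one stackup configuration to be fed to the field solver.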

Generate inputs setup and simulate:

Once we have the sample points, a layer stackup configuration will be generated for each of them in the syntax of the solver to be used. Each field solver has a different syntax, thus a flow will be needed... TProFlow

In this case, we use HSPICE from Synopsys as the field solver, thus each of the 10K parameter combinations will be used to generate its SPICE input file for simulation: Simulate

The next step is to perform circuit simulation for all these cases. This may be a time-consuming process, so a distributed environment or simulation farm may be used.

Performance measurement:

The outcome of each simulation is a frequency-dependent tabular model corresponding to its layer stackup settings. HSPICE's tabular format looks like this: Tabular The next step is to load these models and perform performance measurement: Measure Matrix manipulations such as eigenvalue decomposition will be applied in order to obtain the characteristic impedance and propagation speed. The measurement output of each model is a set of parameters which will be combined with the original inputs to form the dataset for our prediction modeling.
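The eigenvalue step can be sketched as follows: for a lossless multiconductor line, eigen-decomposition of the per-unit-length L·C product gives the modal propagation speeds, and the characteristic impedance matrix follows from the matrix square root of L·C. The matrices below are made-up illustrative values, not actual solver output:

```python
import numpy as np

# Illustrative per-unit-length matrices for three coupled strip-line
# traces (victim in the middle) -- NOT actual field-solver output.
L = np.array([[3.1e-7, 3.0e-8, 4.0e-9],
              [3.0e-8, 3.1e-7, 3.0e-8],
              [4.0e-9, 3.0e-8, 3.1e-7]])      # H/m
C = np.array([[ 1.3e-10, -8.0e-12, -5.0e-13],
              [-8.0e-12,  1.3e-10, -8.0e-12],
              [-5.0e-13, -8.0e-12,  1.3e-10]])  # F/m

# Eigen-decomposition of L*C gives the propagation modes:
# each eigenvalue is 1/v_i^2, with v_i the modal propagation speed.
lam, T = np.linalg.eig(L @ C)
speeds = 1.0 / np.sqrt(lam.real)       # m/s per mode

# Lossless characteristic impedance matrix: Zc = sqrtm(L*C)^-1 * L,
# with the matrix square root formed in the modal basis.
sqrt_LC = T @ np.diag(np.sqrt(lam)) @ np.linalg.inv(T)
Zc = np.linalg.inv(sqrt_LC) @ L
z0_victim = Zc[1, 1].real              # single-ended Z0 of the middle trace
```

With physically reasonable L/C values this yields a Z0 near 50 ohms and propagation speeds around half the speed of light, matching what the dataset columns below contain.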

Prepare Data:

From this point, we can start the modeling process using python and various packages.

In [101]:
%matplotlib inline

## Initial set-up for data
import os
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

prjHome = 'C:/Temp/WinProj/LStkMdl'
workDir = prjHome + '/wsp/'
srcFile = prjHome + '/dat/SLSE3.csv'

def save_fig(fig_id, tight_layout=True, fig_extension="png", resolution=300):
    path = os.path.join(workDir, fig_id + "." + fig_extension)
    print("Saving figure", fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format=fig_extension, dpi=resolution)
In [102]:
# Let's read the data and do some statistic
srcData = pd.read_csv(srcFile)

# Take a peek:
srcData.head()
Out[102]:
H1 H3 ER1 ER3 TD1 TD3 H2 W S EDW SIGMA H FNAME Z0SE(1_SE) S0SE(1_SE) KBSENB(1_1) KFSENB(1_1) A(1_1)
0 4.9311 5.5859 3.6964 3.3277 0.013771 0.030565 0.91236 7.5573 30.6610 0.177530 50000000.0 15 SPIMDL00001.TAB 37.715088 1.601068e+08 0.000035 -7.297762e-15 0.497289
1 4.8918 5.3156 4.0410 3.8404 0.022008 0.008600 0.53126 7.0378 6.9217 0.733370 50000000.0 15 SPIMDL00002.TAB 39.831954 1.510345e+08 0.019608 -1.093612e-12 0.249591
2 1.7907 11.5230 3.6984 3.3534 0.013859 0.045137 2.00990 3.6964 27.3230 0.707750 50000000.0 15 SPIMDL00003.TAB 35.663928 1.587798e+08 0.000305 -2.797417e-13 0.705953
3 2.8595 4.7259 3.9605 3.3981 0.010481 0.018028 0.36547 2.3872 7.2803 0.735210 50000000.0 15 SPIMDL00004.TAB 59.456438 1.558444e+08 0.010125 -5.478920e-12 0.327952
4 5.5946 8.2553 3.4176 3.3249 0.024434 0.003663 2.08810 6.3776 20.1680 0.053585 50000000.0 15 SPIMDL00005.TAB 46.082722 1.633106e+08 0.004136 -4.787903e-13 0.169760
In [103]:
srcData.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 18 columns):
H1             10000 non-null float64
H3             10000 non-null float64
ER1            10000 non-null float64
ER3            10000 non-null float64
TD1            10000 non-null float64
TD3            10000 non-null float64
H2             10000 non-null float64
W              10000 non-null float64
S              10000 non-null float64
EDW            10000 non-null float64
SIGMA          10000 non-null float64
H              10000 non-null int64
FNAME          10000 non-null object
Z0SE(1_SE)     10000 non-null float64
S0SE(1_SE)     10000 non-null float64
KBSENB(1_1)    9998 non-null float64
KFSENB(1_1)    9998 non-null float64
A(1_1)         10000 non-null float64
dtypes: float64(16), int64(1), object(1)
memory usage: 1.4+ MB
In [104]:
srcData.describe()
Out[104]:
H1 H3 ER1 ER3 TD1 TD3 H2 W S EDW SIGMA H Z0SE(1_SE) S0SE(1_SE) KBSENB(1_1) KFSENB(1_1) A(1_1)
count 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 10000.0 10000.0 10000.000000 1.000000e+04 9.998000e+03 9.998000e+03 10000.000000
mean 4.199999 7.500007 3.850001 3.850001 0.025000 0.025000 1.400000 5.500001 18.500007 0.375000 50000000.0 15.0 42.985345 1.533140e+08 1.475118e-02 -2.155868e-12 0.509534
std 1.616661 3.175591 0.490773 0.490772 0.014434 0.014434 0.635117 2.020829 9.526750 0.216515 0.0 0.0 172.163835 7.085288e+06 2.847164e-02 3.787910e-11 0.288974
min 1.400000 2.001000 3.000000 3.000100 0.000003 0.000002 0.300140 2.000100 2.002600 0.000002 50000000.0 15.0 11.630424 1.384625e+08 1.009146e-10 -2.796522e-09 0.011486
25% 2.799925 4.750325 3.425075 3.425050 0.012500 0.012501 0.850038 3.750350 10.250000 0.187510 50000000.0 15.0 32.173933 1.479284e+08 2.216767e-04 -1.112024e-12 0.328938
50% 4.200000 7.500500 3.850000 3.849950 0.025000 0.025000 1.399950 5.500150 18.500000 0.374985 50000000.0 15.0 40.031614 1.527750e+08 1.969018e-03 -1.928585e-15 0.507237
75% 5.600075 10.249500 4.274950 4.274950 0.037499 0.037498 1.949875 7.249650 26.748000 0.562507 50000000.0 15.0 48.688036 1.583456e+08 1.376321e-02 7.447876e-13 0.680500
max 6.999900 13.000000 4.699800 4.699900 0.050000 0.049995 2.500000 8.999600 34.999000 0.749930 50000000.0 15.0 17097.973570 1.725066e+08 3.184933e-01 9.613918e-11 14.524624

Note that:

  • Sigma (conductivity) and H (default layer height) are constants in this setup;
  • FNAME (file name) is not needed for modeling;
  • Z0 (impedance) has outliers;
  • Forward/backward crosstalk (Kf/Kb) have missing entries.
In [105]:
# drop constant and file name columns
stkData = srcData.drop(columns=['H', 'SIGMA', 'FNAME'])
In [106]:
# plot distributions before dropping measurement outliers
stkData.hist(bins=50, figsize=(20,15))
save_fig("attribute_histogram_plots")
plt.show()
Saving figure attribute_histogram_plots
In [107]:
# drop outliers and invalid Kb/Kf cells
# These may be caused by an unphysical stackup model or by calculations during post-processing
maxZVal = 200
minZVal = 10
stkTemp = stkData[(stkData['Z0SE(1_SE)'] < maxZVal) & \
                  (stkData['Z0SE(1_SE)'] > minZVal) & \
                  (np.abs(stkData['KBSENB(1_1)']) > 0.0) & \
                  (np.abs(stkData['KFSENB(1_1)']) > 0.0)]

# Check again to make sure the data are now clean
stkTemp.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 9994 entries, 0 to 9999
Data columns (total 15 columns):
H1             9994 non-null float64
H3             9994 non-null float64
ER1            9994 non-null float64
ER3            9994 non-null float64
TD1            9994 non-null float64
TD3            9994 non-null float64
H2             9994 non-null float64
W              9994 non-null float64
S              9994 non-null float64
EDW            9994 non-null float64
Z0SE(1_SE)     9994 non-null float64
S0SE(1_SE)     9994 non-null float64
KBSENB(1_1)    9994 non-null float64
KFSENB(1_1)    9994 non-null float64
A(1_1)         9994 non-null float64
dtypes: float64(15)
memory usage: 1.2 MB
In [108]:
# now plot the distributions again; we should see proper distributions now
stkData = stkTemp
stkData.hist(bins=50, figsize=(20,15))
save_fig("attribute_histogram_plots")
plt.show()
Saving figure attribute_histogram_plots
In [109]:
# find the attributes most correlated with Z0
corr_matrix = stkData.drop(columns=['KBSENB(1_1)', 'KFSENB(1_1)', 'S0SE(1_SE)', 'A(1_1)']).corr()
corr_matrix['Z0SE(1_SE)'].abs().sort_values(ascending=False)
Out[109]:
Z0SE(1_SE)    1.000000
W             0.664396
H1            0.535432
H3            0.344356
H2            0.186618
ER1           0.119390
ER3           0.110507
EDW           0.067261
TD3           0.061906
TD1           0.017937
S             0.000107
Name: Z0SE(1_SE), dtype: float64

From the correlation results above, it can be seen that trace width and dielectric heights are the dominant factors for the trace's impedance.

Choose a Model:

Since we are building a numerical estimator here, I will try a simple linear regressor first:

In [110]:
# Separate input and output attributes
allTars = ['Z0SE(1_SE)', 'KBSENB(1_1)', 'KFSENB(1_1)', 'S0SE(1_SE)', 'A(1_1)']
varList = [e for e in list(stkData) if e not in allTars]
varData = stkData[varList]
In [111]:
# We have 10,000 cases here, try in-memory normal equation directly first:

# LinearRegression Fit Impedance
from sklearn.linear_model import LinearRegression

tarData = stkData['Z0SE(1_SE)']
lin_reg = LinearRegression()
lin_reg.fit(varData, tarData)

# Check predictions on the training set using RMSE
from sklearn.metrics import mean_squared_error, mean_absolute_error
predict = lin_reg.predict(varData)
resRMSE = np.sqrt(mean_squared_error(tarData, predict))
resRMSE
Out[111]:
3.478480854776983
In [112]:
# Use 10-fold cross-validation:
def display_scores(attribs, scores):
    print("Attribute:", attribs)
    print("Scores:", scores)
    print("Mean:", scores.mean())
    print("Standard deviation:", scores.std())
    
from sklearn.model_selection import cross_val_score
lin_scores = cross_val_score(lin_reg, varData, tarData,
                             scoring="neg_mean_squared_error", cv=10)
lin_rmse_scores = np.sqrt(-lin_scores)
display_scores(varList, lin_rmse_scores)
Attribute: ['H1', 'H3', 'ER1', 'ER3', 'TD1', 'TD3', 'H2', 'W', 'S', 'EDW']
Scores: [3.25291562 4.16881619 3.22429058 3.32251943 3.70096691 3.46812853
 3.34169206 3.5511994  3.28259776 3.39876687]
Mean: 3.4711893352193854
Standard deviation: 0.270957271481177
In [113]:
# try regularized regression (Ridge)
from sklearn.linear_model import Ridge
ridge_reg = Ridge(alpha=1, solver="cholesky")
ridge_reg.fit(varData, tarData)
predict = ridge_reg.predict(varData)
resRMSE = np.sqrt(mean_squared_error(tarData, predict))
resRMSE
Out[113]:
3.4921877809274595

Thus an RMSE of about 3.5 ohms is obtained from this estimator. What if a higher-order regression is used?

In [114]:
from sklearn.preprocessing import PolynomialFeatures
poly_features = PolynomialFeatures(degree=2, include_bias=False)
varPoly = poly_features.fit_transform(varData)
lin_reg = LinearRegression()
lin_reg.fit(varPoly, tarData)
predict = lin_reg.predict(varPoly)
resRMSE = np.sqrt(mean_squared_error(tarData, predict))
resRMSE
Out[114]:
1.2780298094097002

A more accurate model may thus be obtained this way.

Training and Evaluation:

In [115]:
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def plot_learning_curves(model, X, y):
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=10)
    train_errors, val_errors = [], []
    for m in range(1, len(X_train)):
        model.fit(X_train[:m], y_train[:m])
        y_train_predict = model.predict(X_train[:m])
        y_val_predict = model.predict(X_val)
        train_errors.append(mean_squared_error(y_train_predict, y_train[:m]))
        val_errors.append(mean_squared_error(y_val_predict, y_val))

    plt.plot(np.sqrt(train_errors), "r-+", linewidth=2, label="Training set")
    plt.plot(np.sqrt(val_errors), "b-", linewidth=3, label="Validation set")
    plt.legend(loc="upper right", fontsize=14)
    plt.xlabel("Training set size", fontsize=14)
    plt.ylabel("RMSE", fontsize=14)

lin_reg = LinearRegression()
plot_learning_curves(lin_reg, varData, tarData)
plt.axis([0, 8000, 0, 20])
save_fig("underfitting_learning_curves_plot")
plt.show()
Saving figure underfitting_learning_curves_plot

Neural Network:

As the difference between prediction and actual measurement is about two ohms, it has met our modeling goals. As an alternative approach, let's try neural network modeling below:

In [116]:
from keras.models import Sequential
from keras.layers import Dense, Dropout

numInps = len(varList)
nnetMdl = Sequential()
# input layer
nnetMdl.add(Dense(units=64, activation='relu', input_dim=numInps))

# hidden layers
nnetMdl.add(Dropout(0.3, noise_shape=None, seed=None))
nnetMdl.add(Dense(64, activation = "relu"))
nnetMdl.add(Dropout(0.2, noise_shape=None, seed=None))
          
# output layer
nnetMdl.add(Dense(units=1, activation='sigmoid'))
nnetMdl.compile(loss='mean_squared_error', optimizer='adam')

# Provide some info
#from keras.utils import plot_model
#plot_model(nnetMdl, to_file= workDir + 'model.png')
nnetMdl.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_10 (Dense)             (None, 64)                704       
_________________________________________________________________
dropout_7 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_11 (Dense)             (None, 64)                4160      
_________________________________________________________________
dropout_8 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_12 (Dense)             (None, 1)                 65        
=================================================================
Total params: 4,929
Trainable params: 4,929
Non-trainable params: 0
_________________________________________________________________
In [117]:
# Prepare Training (tran) and Validation (test) dataset
varTran, varTest, tarTran, tarTest = train_test_split(varData, tarData, test_size=0.2)

# scale the data
from sklearn import preprocessing
varScal = preprocessing.MinMaxScaler()
varTran = varScal.fit_transform(varTran)
varTest = varScal.transform(varTest)

tarScal = preprocessing.MinMaxScaler()
tarTran = tarScal.fit_transform(tarTran.values.reshape(-1, 1))
In [118]:
hist = nnetMdl.fit(varTran, tarTran, epochs=50, batch_size=1000, validation_split=0.1)
tarTemp = nnetMdl.predict(varTest, batch_size=1000)
predict = tarScal.inverse_transform(tarTemp)
resRMSE = np.sqrt(mean_squared_error(tarTest, predict))
resRMSE
Train on 7195 samples, validate on 800 samples
Epoch 1/50
7195/7195 [==============================] - 0s 43us/step - loss: 0.0457 - val_loss: 0.0205
Epoch 2/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0202 - val_loss: 0.0150
Epoch 3/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0173 - val_loss: 0.0157
Epoch 4/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0169 - val_loss: 0.0134
Epoch 5/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0148 - val_loss: 0.0111
Epoch 6/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0129 - val_loss: 0.0094
Epoch 7/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0110 - val_loss: 0.0078
Epoch 8/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0093 - val_loss: 0.0061
Epoch 9/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0078 - val_loss: 0.0045
Epoch 10/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0063 - val_loss: 0.0034
Epoch 11/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0055 - val_loss: 0.0026
Epoch 12/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0047 - val_loss: 0.0021
Epoch 13/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0043 - val_loss: 0.0018
Epoch 14/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0039 - val_loss: 0.0017
Epoch 15/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0037 - val_loss: 0.0016
Epoch 16/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0034 - val_loss: 0.0015
Epoch 17/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0032 - val_loss: 0.0015
Epoch 18/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0030 - val_loss: 0.0015
Epoch 19/50
7195/7195 [==============================] - ETA: 0s - loss: 0.002 - 0s 8us/step - loss: 0.0029 - val_loss: 0.0015
Epoch 20/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0028 - val_loss: 0.0014
Epoch 21/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0028 - val_loss: 0.0014
Epoch 22/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0026 - val_loss: 0.0014
Epoch 23/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0025 - val_loss: 0.0013
Epoch 24/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0024 - val_loss: 0.0013
Epoch 25/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0024 - val_loss: 0.0013
Epoch 26/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0023 - val_loss: 0.0012
Epoch 27/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0023 - val_loss: 0.0012
Epoch 28/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0022 - val_loss: 0.0012
Epoch 29/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0022 - val_loss: 0.0012
Epoch 30/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0020 - val_loss: 0.0011
Epoch 31/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0021 - val_loss: 0.0011
Epoch 32/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0020 - val_loss: 0.0011
Epoch 33/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0020 - val_loss: 0.0011
Epoch 34/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0019 - val_loss: 0.0010
Epoch 35/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0018 - val_loss: 0.0010
Epoch 36/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0018 - val_loss: 9.9226e-04
Epoch 37/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0018 - val_loss: 9.8783e-04
Epoch 38/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0017 - val_loss: 9.5433e-04
Epoch 39/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0017 - val_loss: 9.5077e-04
Epoch 40/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0017 - val_loss: 9.2018e-04
Epoch 41/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0016 - val_loss: 9.1261e-04
Epoch 42/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 9.0183e-04
Epoch 43/50
7195/7195 [==============================] - 0s 4us/step - loss: 0.0016 - val_loss: 8.8474e-04
Epoch 44/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 8.6719e-04
Epoch 45/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0015 - val_loss: 8.4854e-04
Epoch 46/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 8.2477e-04
Epoch 47/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0015 - val_loss: 8.0656e-04
Epoch 48/50
7195/7195 [==============================] - 0s 6us/step - loss: 0.0014 - val_loss: 8.1118e-04
Epoch 49/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0014 - val_loss: 7.8844e-04
Epoch 50/50
7195/7195 [==============================] - 0s 5us/step - loss: 0.0014 - val_loss: 7.9169e-04
Out[118]:
1.982441126153167
In [119]:
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Val'], loc='upper right')
plt.show()

With the number of epochs increased to 100, we can even obtain 1.5 ohms accuracy. It seems this neural network model is comparable to the polynomial regressor and meets our needs.

In [120]:
# save model and architecture to single file
nnetMdl.save(workDir + "LStkMdl.h5")

# finally
print("Saved model to disk")
Saved model to disk

Deploy:

Using the SLSE3 data set as an example, we follow a similar process and have built 10+ prediction models for different stackup structure setups. The polynomial model or neural network can be implemented in Java/C++ to avoid dependencies on Python packages for distribution purposes. The implemented front-end, shown below, provides a quick and easy method for system designers to do stackup/routing planning: Deploy
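Since the polynomial regressor reduces to a fixed set of weights, one portable option is to dump the expanded-feature exponents and coefficients to JSON so that a Java/C++ front-end only has to evaluate a weighted sum of monomials. The sketch below uses stand-in training data and placeholder feature names rather than the actual SLSE3 columns:

```python
import json
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# stand-in training data; in the real flow this would be varData/tarData
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 9.0, size=(200, 2))                   # e.g. W and H1
y = 40.0 - 2.0 * X[:, 0] + 3.0 * X[:, 1] + 0.1 * X[:, 0] * X[:, 1]

poly = PolynomialFeatures(degree=2, include_bias=False)
Xp = poly.fit_transform(X)
reg = LinearRegression().fit(Xp, y)

# A deployed front-end only needs the exponent table, weights and intercept:
#   prediction = intercept + sum_k coef[k] * prod_j x_j ** powers[k][j]
model_json = json.dumps({
    "powers": poly.powers_.tolist(),   # exponents of each expanded feature
    "coef": reg.coef_.tolist(),
    "intercept": float(reg.intercept_),
})

# evaluate the exported model by hand, the way the Java/C++ code would
m = json.loads(model_json)
x = np.array([3.0, 5.0])
pred = m["intercept"] + sum(c * np.prod(x ** np.array(p))
                            for c, p in zip(m["coef"], m["powers"]))
```

A neural network export works the same way in spirit: dump the layer weights and re-implement the forward pass in the target language.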

Conclusion:

In this post/markdown document, we describe the stackup modeling process using data science/machine learning techniques. The outcome is a deployed front-end with the trained neural network for the user's instant performance evaluation. The data set and this markdown document are published on this project's GitHub page.

IBIS-AMI: Something about CTLE

Overview:

A continuous time linear equalizer, or CTLE for short, is commonly used in modern communication channels. In a system where lossy channels are present, a CTLE can often recover signal quality for the receiver or downstream continuous signaling. There have been many articles online discussing how a CTLE works theoretically. More thorough technical details are certainly also available in college/graduate-level communication/IC design textbooks. In this blog post, I would like to focus more on its IBIS-AMI modeling aspects from a practical point of view. While not all secret sauce will be revealed here :-), hopefully the points mentioned here will give readers a good starting point in implementing or determining their CTLE/AMI modeling methodologies.

[Credit:] Some of the pictures used in this post are from Professor Sam Palermo’s course webpage linked below. He was also my colleague at Intel previously (we did not know each other there, though).

ECEN 720: High-Speed Links Circuits and Systems

 

What and why CTLE:

The picture above shows two common SERDES channel setups. While the one at the top has a direct connection between Tx and Rx, the bottom one has a “repeater” to cascade the upstream and downstream channels together. This “cascading” can be repeated more than once, so there may be more than two channels involved. A CTLE may sit inside the Rx of both setups, or in the middle “ReDriver” of the bottom one. In either case, the S-parameter block represents a generalized channel. It may contain passive elements such as packages, transmission lines, vias or connectors. A characteristic of such a channel is that it presents different losses across the spectrum, i.e. dispersion.

For example, if we plot these channels’ differential-input to differential-output responses, we may see their frequency domain (FD) losses as shown above.

The digital signals being transmitted are more or less sequences of bit/square pulses. We know that very rich frequency contents are generated by their sharp rising/falling transition edges. Ideally, a communication channel propagating these signals should behave like a (unity-gain) all-pass filter. That is, the various frequency components of the signal should all be treated equally, otherwise distortion will occur. Such an ideal response is indicated as the green box below:

In reality, such an all-pass filter rarely exists. In order to compensate our lossy channel (indicated by the red box) so that the end result is more like the ideal case (green box), we need to apply equalization (indicated by the blue box). This is why an equalizer is often used… basically it provides a transfer function/frequency response that compensates the lossy channel in order to recover the signal quality. A point worth taking here is that the blue box and the red box are “tied” together, so using the same equalizer for channels of different losses may cause under- or over-compensation. That is, an equalizer is matched to the particular channel being compensated. Another point is that a CTLE is just one kind of such linear equalizers.
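This compensation idea can be shown with a toy numeric example: model the lossy channel as a single-pole low-pass and give the equalizer a zero at the same frequency, so that the cascade is nearly flat over the band of interest. The pole locations here are arbitrary illustrative values, not from any real channel:

```python
import numpy as np

f = np.logspace(6, 10, 201)              # 1 MHz .. 10 GHz
s = 1j * 2 * np.pi * f

fp = 2e9                                 # hypothetical channel pole (Hz)
H_ch = 1.0 / (1 + s / (2 * np.pi * fp))  # lossy channel: single-pole low-pass

# equalizer: a zero at the channel pole, with its own pole pushed far out
H_eq = (1 + s / (2 * np.pi * fp)) / (1 + s / (2 * np.pi * 50e9))

H_sys = H_ch * H_eq                      # cascade: the zero cancels the pole
sys_ripple_db = 20 * np.log10(np.abs(H_sys))
ch_loss_db_10g = 20 * np.log10(np.abs(H_ch[-1]))   # channel alone at 10 GHz
```

The raw channel loses well over 10 dB at 10 GHz, while the equalized cascade stays within a fraction of a dB of flat. It also makes the "tied together" point concrete: the equalizer's zero only cancels this particular channel's pole.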

CTLE is a subset of linear equalizer:

A linear equalizer can be implemented in many different ways. For example, a feed-forward equalizer (FFE) is often used on the Tx side and within a DFE:

An FFE’s behavior is relatively easy to predict and its AMI implementation is also quite straightforward. For example, a single pre-tap or post-tap FFE’s response can be easily visualized and predicted:
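Since a one-pre/one-post-tap FFE is just a 3-tap FIR filter, its response can be sketched directly (the tap weights below are made up for illustration):

```python
import numpy as np

# FFE tap weights [pre, main, post]; negative side taps sharpen edges
taps = np.array([-0.1, 0.8, -0.1])

# response to an isolated "1" bit: main cursor plus negative pre/post cursors
bits = np.zeros(16)
bits[8] = 1.0
pulse = np.convolve(bits, taps)

# frequency response of the 3-tap FIR: gain rises toward Nyquist
w = np.linspace(0.0, np.pi, 256)
H = np.abs(taps[0] + taps[1] * np.exp(-1j * w) + taps[2] * np.exp(-2j * w))
boost_db = 20 * np.log10(H[-1] / H[0])   # Nyquist gain over DC gain
```

With these weights the DC gain is 0.6 while the Nyquist gain is 1.0, i.e. roughly 4.4 dB of high-frequency boost, which is exactly the visualized "de-emphasis" shape of a single-tap FFE.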

Now, a CTLE is a more “generalized” linear equalizer, so its behavior is usually represented in terms of frequency responses. Thus, to accommodate/compensate channels of different losses, we will have different FD responses for the CTLE:

Now that IBIS-AMI modeling for CTLE is the concern, how do we obtain such modeling data for a CTLE, and how should it be modeled?

Different types of CTLE modeling data:

While a CTLE’s behavior is most easily understood in the frequency domain, for IBIS-AMI or channel analysis it eventually needs to come back to the time domain (TD) to be convolved with inputs. This is because both the statistical and bit-by-bit modes of link analysis operate in the time domain. Thus we have several choices: provide the model FD data and have it converted to TD inside the implemented AMI model, or simply provide the TD response to the model directly. The benefit of the first approach is that the model can perform the iFFT based on the analysis’s bit rate and sampling rate settings. The advantage of the latter is that the provided TD model can be checked for good quality beforehand, and the model does not need to redo a similar iFFT every time simulation starts. Of course, the best implementation, i.e. SPISim’s approach, is to support both modes for the best flexibility and expandability 🙂

  • Frequency domain data:

Depending on the availability of the original EQ design, there are several possibilities for FD data: synthesize it with poles and zeros, extract it from S-parameters, or run AC simulation to extract the response.

  1. Poles and zeros: Given different numbers of poles and zeros, their locations, and the dc boost level, one can synthesize FD response curves. So, say we are given a data sheet which specifies the EQ levels at some key frequencies, like below: then one can sweep different numbers and locations of poles and zeros to obtain matching curves that meet the spec. Such synthesized curves are well behaved in terms of passivity and causality etc., and can be extended to cover the desired frequency bandwidth.
  2. Extract from S-parameters: Another way to obtain the frequency response is from the EQ circuit’s existing S-parameters. This provides the best correlation scenario for the generated AMI model because the raw data can serve as a design target. However, there are many intermediate steps one has to perform first. For example, the given S-parameter may be single-ended and only cover a limited frequency range (due to limitations of the VNA being used), so if a tool like our SPISim’s SPro is used, one needs to: reorder the ports (from even-odd ordering, i.e. 1-3, 2-4 etc., to sequential ordering, i.e. 1, 2 -> 3, 4), then convert to differential/mixed mode, after that extrapolate toward dc and high frequencies (many algorithms can be used, and such extrapolation must also abide by physics), and finally extract only the related differential-input -> differential-output portion of the data.
  3. AC simulation: This assumes the original design is available for AC simulation. Such raw data still needs to be sanity checked in terms of loss and phase change. For example, if the gain is not flat toward the DC and high-frequency ranges, then extra fixing may be needed, otherwise the iFFT results will be spurious.
  • Time domain data: The time-domain response can be obtained from the aforementioned FD data by doing an iFFT directly, as shown below. It may also be obtained by simulating the original EQ circuit in the time domain. However, there are still several considerations:
  1. How to do the iFFT: Padding with zeros or conjugate symmetry is usually needed for a real-valued iFFT. If the original FD data is not “clean” in terms of causality, passivity or asymptotic behavior, then it needs to be fixed first.
  2. TD simulation: Is simulating an impulse response possible? If not, maybe a step response should be performed instead. Then what is the time step or ramp speed used to excite the input stimulus? Note that during IBIS-AMI’s link analysis, the time step being used there may be different from the one being used here, so how will you scale the data accordingly? Once a step response is available, successive differentiation will produce the impulse response with proper scaling.
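A minimal sketch of both data paths follows: synthesize a CTLE curve from one zero and two poles, then convert it to a time-domain impulse response with a real-valued iFFT (`np.fft.irfft` supplies the conjugate-symmetric negative frequencies implicitly, so the result comes out purely real). The pole/zero locations are illustrative, not from any particular data sheet:

```python
import numpy as np

# illustrative CTLE: one zero, two poles (classic peaking shape)
fz, fp1, fp2 = 0.65e9, 2.0e9, 8.0e9
n, df = 4096, 5e6                        # one-sided grid: 4096 pts, 5 MHz step
f = np.arange(n) * df
s = 1j * 2 * np.pi * f

H = (1 + s / (2*np.pi*fz)) / ((1 + s / (2*np.pi*fp1)) * (1 + s / (2*np.pi*fp2)))

peak_db = 20 * np.log10(np.max(np.abs(H)))   # peaking relative to 0 dB DC gain

# real-valued iFFT: irfft assumes conjugate symmetry for the negative
# frequencies, so the impulse response is purely real
h = np.fft.irfft(H)
dt = 1.0 / (2 * (n - 1) * df)                # resulting TD sample period
```

The impulse response energy concentrates right after t = 0 as a causal response should; a badly extrapolated or non-causal H would instead spread energy into the "negative time" (wrap-around) samples, which is one quick sanity check worth automating.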

How to implement CTLE AMI model:

Now that we have data to model, how it will be implemented in C/C++ code to support the AMI API for link analysis is another level of consideration.

  • Decision mechanism: As mentioned previously, a CTLE FD response targets a channel of a certain loss, thus the decision to use the appropriate CTLE settings for the particular channel at hand must either be made by the user or by the model itself. While the former (user decision) needs no further explanation, the latter case (model decision, i.e. being adaptive) is a whole different topic and often vendor specific.

Typically, such an adaptive mechanism has CTLE curves pre-sorted in terms of strength or EQ level; then a figure of merit (FOM) needs to be extracted from the equalized signal. That is, apply a tentative CTLE to the received data, then calculate the FOM. Then increase or decrease the EQ level using the adjacent CTLE curves and see whether the FOM improves. Continue doing so until either the selected CTLE “ID” settles or the range bounds are reached. This process may be performed across many cycles until it is “stabilized” or “locked”. Thus, the model may need to go through a training period first to determine the best CTLE to use during subsequent link analysis.
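The adaptation loop described above can be sketched as follows. The FOM here is a crude eye-opening proxy (main cursor height minus total ISI), and the pre-sorted "CTLE bank" is a set of 3-tap FIR filters standing in for real FD curves; all numbers are made up for illustration:

```python
import numpy as np

def fom(pulse):
    """Crude figure of merit: main cursor height minus total ISI."""
    main = np.argmax(np.abs(pulse))
    return np.abs(pulse[main]) - (np.sum(np.abs(pulse)) - np.abs(pulse[main]))

# hypothetical CTLE bank, pre-sorted weakest to strongest
# (3-tap FIR filters standing in for real CTLE responses)
ctle_bank = [np.array([-k, 1.0, -k]) for k in (0.0, 0.1, 0.2, 0.3, 0.4)]

# channel pulse response with trailing ISI (made-up numbers)
pulse_in = np.array([0.0, 0.05, 1.0, 0.35, 0.15, 0.05, 0.0])

# greedy adaptation: step the EQ "ID" up while the FOM keeps improving
best_id = 0
while best_id + 1 < len(ctle_bank):
    cur = fom(np.convolve(pulse_in, ctle_bank[best_id]))
    nxt = fom(np.convolve(pulse_in, ctle_bank[best_id + 1]))
    if nxt > cur:
        best_id += 1
    else:
        break
```

For this pulse the loop settles on an interior setting rather than the strongest one: the strongest CTLE over-equalizes and the FOM drops, which is exactly the under/over-compensation trade-off the adaptation is meant to resolve.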

  • EQ configurations:

So now you have a bunch of settings or data like below; how should you architect the model so that it can be extended in the future with revised CTLE responses, or allow the user to perform corner selection (which essentially adds another dimension)?

This is now more in the software-architecture domain and involves some trade-off considerations. For example, you may want to provide a fine-grid, full-spectrum FD/TD response, but the data may become too big, so internal re-sampling may be needed. For FD data, the model may need to resample to 2^N points for an efficient iFFT. Different corner/parameter selections should not be hard-coded in the model, because a future revised model's parameters may be different. For externally sourced data, encryption is usually needed to protect the modeling IP. With proper planning, one may reuse the same CTLE module in many different designs without case-by-case customization.
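The 2^N re-sampling mentioned above might look like the following sketch (assumed approach: linear interpolation of real and imaginary parts onto a uniform grid; a production model may prefer magnitude/phase or spline interpolation):

```python
import numpy as np

# Interpolate tabulated FD data (freqs, values) onto a uniform grid with
# 2**n_exp + 1 points from DC to the highest tabulated frequency, so the
# single-sided data iFFTs into 2**n_exp time samples.
def resample_pow2(freqs, values, n_exp=10):
    grid = np.linspace(0.0, freqs[-1], 2**n_exp + 1)
    re = np.interp(grid, freqs, values.real)
    im = np.interp(grid, freqs, values.imag)
    return grid, re + 1j * im
```

Keeping the grid generation inside the model (instead of hard-coding point counts in the AMI parameter file) is one way to leave room for future revisions of the tabulated data.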

  • Correlations:

Finally it’s time to correlate the created CTLE AMI model against the original EQ design or its behavioral model. Done properly, you should see the signal being “recovered” from left to right below:

However, getting results like this on the first try may be wishful thinking. In particular, the IBIS-AMI model does not work alone… in most link simulators it must work together with an associated IBIS model (the analog front-end), so that IBIS model’s parasitics and loading etc. will all affect the result. Besides, the AMI model’s high-impedance assumption also means proper termination matching is needed before one can drop it in as a direct replacement of the existing EQ circuit or behavioral model for correlation.

Summary:

At this point, you may realize that while a CTLE can be easily understood from a theoretical-behavior perspective, implementing it to meet IBIS-AMI demands is a different story. We have seen CTLE models made by another vendor that were not expandable at all, such that the client had to scrap the existing ones for a minor revision of the CTLE behavior/settings (also because this particular model maker charged too much, of course). It’s our hope that the learning experience mentioned in this post will provide some guidance or considerations when you decide to dive deep into developing your own CTLE IBIS-AMI model, or maybe just delegate such tasks to professional model makers like us 🙂

Simulator development: Modeling (S & P)

A system channel is usually represented by S-parameters. They can be extracted in the frequency domain using a 3D field solver, and/or cascaded stage by stage using a tool like SPISim’s SPro. With the LTI (linear time invariant) assumption, it’s possible to synthesize an eye or BER plot of millions of bits using statistical analysis… using a single time-domain pulse response for these parameters. However, it’s still often desired to simulate the S-parameter in the time domain for defined bit patterns. Thus, a system simulator like our SSolve must support such a requirement. In addition, one may also want to know the frequency response of broadband-SPICE-converted elements, such as via, package or connector models. So the reverse process, i.e. extracting S-parameters from SPICE elements, is also often required. In this post, we will briefly talk about how these may be developed in a simulator like SSolver.

S-Element… S-Parameters:

There have been many conference and journal papers proposing different methods of simulating S-parameters in the time domain. However, at the most basic level, an S-parameter can be considered a transfer function or filter block, so DSP techniques apply: a transfer function multiplying the input in the frequency domain becomes a convolution in the time domain:

Convolution

The time-domain right-hand side can be separated into two parts: the history up to this time point (integrating from -infinity to t = n-1) and the contribution at this particular moment t = n due to the present input. The first part is a constant, as it has already happened in the past and can’t be changed; the second part is input dependent and must be updated within the “solve” and “stamp” hot loop inside the Newton iteration. Put together, they form a Norton circuit of the form I = Y * V + J, where Y is the value effective at this moment and J, a constant, is due to past history. This Norton form can then be “stamped” accordingly for matrix solving.
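In discrete form, the history/present split looks like this (a minimal sketch on a fixed time step, with hypothetical names; a real simulator updates the stored history once per accepted time step):

```python
import numpy as np

# With impulse response h and input v on a fixed time step,
#   y[n] = h[0]*v[n] + sum_{k>=1} h[k]*v[n-k]
# The sum over past inputs is the known constant "J" when solving step n;
# the h[0] term is the admittance-like part stamped inside the Newton loop.
def convolve_step(h, v_hist, v_now):
    """v_hist: inputs v[0..n-1] (oldest first); v_now: trial value of v[n]."""
    n = len(v_hist)
    hist = sum(h[k] * v_hist[n - k] for k in range(1, min(n + 1, len(h))))
    return h[0] * v_now + hist

# Check against a full convolution:
h = np.array([0.5, 0.3, 0.2])
v = np.array([1.0, -1.0, 2.0, 0.5])
y = [convolve_step(h, v[:n], v[n]) for n in range(len(v))]
assert np.allclose(y, np.convolve(h, v)[: len(v)])
```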

Interested readers may refer to the paper published by HP, linked below, for detailed explanations and math:

Integration of Transient S-Parameter Simulation into HSpice

Equations [18] and [19] there are the aforementioned Norton equivalent circuit form and can be used accordingly.

The convolution method needs to update the history with the solved results of this time step so they can be used at the next one. In addition, basic convolution requires dt to be constant, so a variable-time-step simulation is greatly hampered by this requirement in terms of performance. The convolution modeling thus has room for improvement.

One possible approach is the vector-fitting technique mentioned in the previous post about “W-element” modeling. With the S-parameter data in the frequency domain, one may construct a Padé approximation using several poles and zeros. Basis functions can then be created for each pole in the time domain and simulated accordingly. A benefit of this process is that the constructed form is a rational function, which is guaranteed to be causal. So if there are causality issues with the provided S-parameter, they can be fixed during the modeling process. Lastly, since an exact fit of a multi-port S-parameter across the frequency spectrum is unlikely, some sort of error minimization (in the MSE sense) is needed to balance accuracy against the number of poles.
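As an illustration of the fitting idea (a simplified sketch, not full vector fitting: the poles are held fixed here, whereas vector fitting also relocates them iteratively), the residues of a pole-residue model are linear unknowns and can be solved in the least-squares sense:

```python
import numpy as np

# Fit H(s) ~ d + sum_k r_k / (s - p_k) for *fixed* poles p_k: the residues
# r_k and constant d are linear, so np.linalg.lstsq minimizes the MSE.
def fit_residues(s, H, poles):
    A = np.column_stack([1.0 / (s[:, None] - poles[None, :]),
                         np.ones_like(s)])
    x, *_ = np.linalg.lstsq(A, H, rcond=None)
    return x[:-1], x[-1]        # residues, constant term

# Recover a known rational function (normalized Laplace variable):
poles = np.array([-1.0, -5.0])
s = 1j * np.linspace(0.01, 10.0, 200)
H = 2.0 / (s - poles[0]) + 0.5 / (s - poles[1]) + 0.1
res, d = fit_residues(s, H, poles)
assert np.allclose(res, [2.0, 0.5]) and np.isclose(d, 0.1)
```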

 

P-Element… Port element:

Often, after a package, connector or via modeling engineer creates a model, he or she will use a tool like Broadband SPICE to convert the 3D-extracted S-parameters to a SPICE equivalent circuit composed of various basic elements. When a system designer or SI engineer receives such converted models, the original S-parameter may no longer be available. Rather than inserting the model into the channel and simulating blindly, it’s often beneficial to reconstruct and inspect the model’s frequency-domain response before actually using it. S-parameter extraction via simulation is basically a form of AC simulation, so once the AC models of the system elements are constructed, the S-parameter extraction part becomes easy.

The context here is small-signal S-parameter extraction, so the AC stimulus is applied very close to the operating point. That is, a DC solution needs to be obtained first for each port’s respective bias condition; the AC stimulus is then applied and solved for each frequency point.

A “port” or “P-element” has several properties: DC bias condition, reference impedance, port name and port order. For multi-port S-parameter extraction, one port is excited at a time with the specified DC bias value. An AC sweep is then performed while the other ports are terminated to their reference impedance. The input and output power waves, obtained from the injected currents and measured nodal voltages, then relate this input to the other output ports. Using the simple math described in the link below:

S-parameter measurements

one can easily obtain the S-parameter entries Sij (i is the port with the input stimulus, j are the other ports) this way. Repeat the same process for the other ports one at a time (with their respective DC bias conditions) and the full S-parameter is obtained. Finally, the ports are arranged according to the port-order property, their respective port names are written at the top of the Touchstone file, and the process is complete. Should there be a need to convert to other formats such as Y or Z parameters (sometimes good for checking connectivity), one can do so easily with formulas (assuming generalized 2-N ports) or simply use a developed tool like our SPro.
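The power-wave math referenced above can be sketched as follows (assuming a real reference impedance Z0 and current defined as flowing into the port):

```python
import numpy as np

# Power waves from port voltage V and current I (phasors at one frequency):
#   a = (V + Z0*I) / (2*sqrt(Z0)),   b = (V - Z0*I) / (2*sqrt(Z0))
# and S_ji = b_j / a_i when only port i is excited.
def power_waves(V, I, Z0=50.0):
    a = (V + Z0 * I) / (2.0 * np.sqrt(Z0))
    b = (V - Z0 * I) / (2.0 * np.sqrt(Z0))
    return a, b

# One-port sanity check: a load ZL at the port gives
# S11 = b/a = (ZL - Z0) / (ZL + Z0).
ZL, Z0 = 75.0, 50.0
I = 1.0                  # 1 A into the load
a, b = power_waves(ZL * I, I, Z0)
assert np.isclose(b / a, (ZL - Z0) / (ZL + Z0))
```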

 

Back to the top:

Coming back from modeling physics to the computer-science domain, one also needs to consider the following topics when doing simulator development:

  • Memory pool management (allocation, expansion and clean-up)
  • Multi-threading consideration
  • Plug-in architecture for future devices
  • ….

While the list can go on and on and the tasks may be daunting, the end results are definitely worth it for analysis flow and methodology development. With a developed simulator, there is no longer an absolute need to form a closed-form formula or equation in order to solve the circuit. A module or flow running on top can simply create a netlist and have the simulator solve it. Not to mention this also makes maintenance and testing much easier. For an EDA company like us, I would say this is a journey worth taking.

Simulator development: Modeling (B & W)

When modeling a device for a circuit simulator, the raw netlist input needs to be converted into internal structures first; a physical model is then constructed during the “modeling” phase, and the corresponding equivalent Norton or Thevenin circuit parameters are solved within each Newton iteration at each time step. The solved parameters are finally “stamped” into the system matrix for Newton iteration solving. “Model” and “solve” are the essential parts of device modeling for a circuit simulator, and that’s where “physics” comes into play.

In these two posts, we will briefly talk about how system devices, in particular IBIS, transmission lines and S-parameters, are “modeled” and “solved”.

B-Element… IBIS:

Looking at the IBIS structure, the modeling part is actually quite straightforward:

IBISEvolve

The four IV curve tables, pull-up, pull-down, power clamp and ground clamp, act like non-linear resistors. With the terminal voltage known within each Newton iteration, the conductance can be looked up from these curve tables using linear interpolation.
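A minimal sketch of that table lookup (hypothetical helper; a real simulator would also handle table bounds and clamping): given tabulated IV points, interpolate the current at the present terminal voltage and form the Norton companion model I = G·V + J for stamping:

```python
import numpy as np

# Linear-interpolation lookup on a tabulated IV curve. Returns the segment
# conductance G (secant slope, standing in for dI/dV) and the Norton current
# source J such that I = G*V + J on that segment.
def iv_companion(v_tab, i_tab, v):
    k = min(max(np.searchsorted(v_tab, v), 1), len(v_tab) - 1)
    g = (i_tab[k] - i_tab[k - 1]) / (v_tab[k] - v_tab[k - 1])
    i = i_tab[k - 1] + g * (v - v_tab[k - 1])   # interpolated current
    return g, i - g * v                          # conductance, source J

# On a quadratic-looking table, segment [1, 2] has slope 3:
v_tab = np.array([0.0, 1.0, 2.0])
i_tab = np.array([0.0, 1.0, 4.0])
g, J = iv_companion(v_tab, i_tab, 1.5)
assert np.isclose(g, 3.0) and np.isclose(J, -2.0)
```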

The switching coefficients and composite currents are both time dependent. Their values are calculated in the “modeling” phase, before the simulation has even started. The obtained coefficients are multipliers which further scale the conductance calculated from the IV data, and thus the stamped value. The scaling is such that, when the test load specified in the waveform section is connected, the driver at the pin will reproduce exactly the waveform data given in the model. As for C-Comp, it can be inserted using the simulator’s existing infrastructure, so the integrator there will manage the stamping and error prediction.

The more complicated portion of IBIS modeling inside a simulator comes from the options available to the end user, which the model developer must plan for in advance. For example, C-Comp may be split across different terminals. Each waveform, IV table or component may have a different skew, which book-keeping code must take care of. There might be added sub-models for pre-emphasis or de-emphasis, so implementation-wise one should consider the “composite” pattern such that recursive inclusion can happen. In the end, this is a relatively simple device to model in a simulator, particularly when compared to the transmission line.

 

W-Element… Transmission line:

Every electromagnetics textbook presents the transmission line structure shown below:

RLGC

This is a uniform distributed model, implemented early on in various simulators as the “U” element. While implementing a T-line model this way is now outdated due to performance issues, it’s still how the T-line’s raw data, the frequency-dependent tabular model, is given:

TabModel

The tabular model is obtained by field-solving Maxwell’s equations based on the layer stackup, trace layout, material properties and sometimes special treatments (like surface roughness), and is finally presented as R/L/G/C data at low (DC) and high (infinite) frequency plus many points in between. Thus, to model a T-line for a simulator, one has to first convert these data to a mathematical form which can then be used in either the time or the frequency domain. For a transmission line, this starts with the telegrapher’s equations.

By solving the KCL/KVL of the unit-length RLGC circuit above, one can derive the telegrapher’s equations:

And the solution to these equations, as explained in the wiki link above, involves a wave propagation function Gamma:

RLGCEq2
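For reference (standard per-unit-length notation, consistent with the R/L/G/C data above), the telegrapher's equations and the resulting propagation quantities are:

```latex
\frac{\partial V(x,s)}{\partial x} = -(R + sL)\,I(x,s), \qquad
\frac{\partial I(x,s)}{\partial x} = -(G + sC)\,V(x,s)
```

with propagation function and characteristic impedance

```latex
\gamma(s) = \sqrt{(R + sL)(G + sC)}, \qquad
Z_c(s) = \sqrt{\frac{R + sL}{G + sC}}
```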

When realized in the system model, it takes the Norton equivalent circuit form:

So on each side (near end and far end) of the transmission line there are two components: an impedance Z at that particular time step, and a current source due to the propagation-delayed wave originating from the other end. These two components (Z(s) and r(s)) can be obtained from the tabular data in the frequency domain and then converted to an integrable form in the time domain so that they can be “stamped”. Generally, this includes the following steps:

  • Parse and store the raw tabular model;
  • Calculate the propagation delay and characteristic impedance using the highest-frequency data (or extrapolation); these values will be used for interpolation later in the time domain;
  • Construct the Z(s) or Y(s) and the wave function r(s) shown in the system model. As transmission lines are usually coupled, these curves are multi-dimensional in the frequency domain;
  • Use the vector-fitting technique to represent these frequency-domain functions as a series of poles and zeros. In most cases, particularly when the model data has insufficient bandwidth or low quality, an exact fit is not possible with a reasonable number of poles/zeros, so a best fit in the minimum-squared-error sense needs to be performed;
  • Once poles and zeros are found, they can be converted into time-domain terms of different orders. All these terms combined form Y(t) or r(t) in the time domain. The Padé approximation may be used here;
  • During time-domain simulation, use interpolation to find r(t)’s value in the past (one propagation delay ago) and use that data to construct the equivalent model of this end at this particular time point;
  • For frequency-domain analysis, vector fitting and conversion to an integrable form are not needed. The Y(s) and r(s) data can be used directly for stamping at each frequency, with some interpolation.
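The second step above can be sketched as follows (a hypothetical helper; assumes a single uncoupled line with per-unit-length R, L, G, C taken at the highest tabulated frequency):

```python
import numpy as np

# Characteristic impedance and per-unit-length propagation delay from one
# R/L/G/C entry. For a low-loss line at high frequency these approach
# sqrt(L/C) and sqrt(L*C); the full complex forms are used here.
def z0_and_delay(R, L, G, C, f):
    s = 2j * np.pi * f
    z0 = np.sqrt((R + s * L) / (G + s * C))      # characteristic impedance
    gamma = np.sqrt((R + s * L) * (G + s * C))   # propagation constant
    delay = gamma.imag / (2 * np.pi * f)         # per-length delay (s/m)
    return z0, delay

# Lossless sanity check: Z0 = sqrt(L/C), delay = sqrt(L*C)
L_, C_ = 2.5e-7, 1.0e-10                         # H/m, F/m
z0, td = z0_and_delay(0.0, L_, 0.0, C_, 10e9)
assert np.isclose(z0.real, 50.0) and np.isclose(td, np.sqrt(L_ * C_))
```

For coupled lines the same operations become matrix-valued (modal decomposition via eigenvalue analysis), which is where the matrix machinery mentioned later comes in.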

For the first three steps, I wrote some simple Matlab code to demonstrate how they are done:

Impedance function:MatlabY

Propagation function:MatlabH

Plots:PlotY

PlotH

While the Matlab code above seems straightforward, most simulators (including SPISim’s SSolver) implement this in native code (C/C++) for performance. So a whole lot of matrix operations (inverse, eigenvalue, LU decomposition etc.) also come into play during the process.

It’s rarely the case that the developed model or code will work on the first try. With so many terms (of converted poles and zeros) in so many dimensions (coupled cases), it’s a daunting task to figure out what has gone wrong when the waveform is not as expected. It’s often simpler to go back to basics: check the steady-state solution first, use a single line, eliminate reflections by matching the impedance at the other end, use the characteristic impedance to find the nominal reflection value, and so on, to help identify the issue:

TLDebug

Interested readers may find more details about SPISim’s implementation, same as HSPICE’s, in the following paper:

“Optimal Transient Simulation of Transmission Lines” by Dmitri Borisovich Kuznetsov, Jose E. Schutt-Aine, IEEE Trans. on Circuit and Systems, I Feb. 1996

In terms of books, I have found that Chap. 5 of Dan Oh’s “High-speed Signaling” book, S6 in our reference-book section, gives the best explanation among others. This may be because Mr. Oh was at UIUC around the time the paper was published 🙂 It’s also worth mentioning that a similar technique can be applied to modeling other passive, homogeneous devices, such as the system channel. For example, one common approach to checking and fixing causality issues of an S-parameter is to use vector fitting and convert to rational function form.