# On-demand serum-free media formulations for human hematopoietic cell expansion using a high dimensional search algorithm

Bioinformatics

### Compounding culture factor combinations for TF-1 cells

The epMotion 5070 liquid handler (Eppendorf) was used to compound the culture condition “recipe” according to the prescribed algorithm-generated test formulations. Each factor was diluted to the appropriate concentration in a base media of DMEM (Gibco #12430054) supplemented with 1% Penicillin-Streptomycin (Pen-Strep; Gibco #15140122) and distributed into 48-well plates. The 15 factors selected to supplement the base media were Glycogen synthase kinase inhibitor (CHIR99021)19,31, Jun N-terminal kinase inhibitor (SP600125)32, dexamethasone (Dexameth)4, granulocyte macrophage-colony-stimulating factor (rhGM-CSF), stem cell factor (rhSCF)33, insulin-like growth factor 1 (rhIGF-1)4, ascorbic acid (AA)4, Rho kinase inhibitor (Y27632)19, albumin (ALB)33, fibronectin (FN)4, GlutaMAX™ Supplement (GL)4,5, cholesterol concentrate (CH)4, ITS Supplement (ITS)4,33, β-mercaptoethanol (bME)4,33, and sodium pyruvate (PY). The dose ranges tested for each factor are listed in Supplementary Table 1.

### TF-1 cell maintenance

TF-1 cells (ATCC #CRL-2003) were maintained in the recommended complete growth medium (RPMI 1640 (Gibco #22400089) supplemented with 10% FBS (Gibco #12483020), 2 ng per ml recombinant human granulocyte-macrophage colony-stimulating factor (rhGM-CSF; R&D Systems #215-GM-010), and 1% Pen-Strep) in T25 or T75 flasks at 37 °C with 5% CO2.

### T-cell preparation

T-cells were isolated from peripheral blood mononuclear cells (PBMCs). Buffy coats (Canadian Blood Services; Donor #2, donation # C0510172128482; Donor #3, donation # C05101721879820) were diluted with an equal volume of sterile buffer consisting of Dulbecco’s phosphate-buffered saline (DPBS, GE Healthcare) with 2% Human Serum AB (Gemini Bio-products). PBMCs were isolated using Ficoll-paque PLUS (GE Healthcare) according to manufacturer guidelines. The cells were washed by centrifugation once again and resuspended in 10 ml of complete growth media (XVIVO CGM) consisting of XVIVO-15 (Lonza) supplemented with 1% PenStrep, 5% human serum, 1% SG-200 (GE Healthcare), and 350 IU per ml recombinant human Interleukin 2 (IL-2, GE Healthcare). The cells prepared in the individual tubes were combined into a single tube and XVIVO CGM was added to a final volume of 80 ml.

CD3-positive (CD3+) T-cells were isolated from PBMCs, either from Buffy Coats or LeukoPak apheresis units (Stem Cell Technologies; Donor #1), as a CD3+ depletion fraction of a CD3−CD56+ NK-cell isolation process. The sequential separation of cell populations utilized positive selection of the CD3+ fraction using magnetic microbeads and the CliniMACS® System or MACS® Columns (Miltenyi Biotec) according to manufacturer instructions. The isolated T-cells were resuspended in CryoStor-10 (CS10, BioLife Solutions) cryopreservation media at aliquot sizes of 20 × 10^6 or 40 × 10^6 cells.

Canadian Blood Services approved the REB application for research work using donor blood material. The work was conducted at CCRM, an approved CL2 facility, with appropriate guidance and procedures complying with all relevant ethical regulations.

### Compounding culture factor combinations for T-cells

The factors were compounded using the Nimbus Microlab Liquid Handling System (Hamilton Robotics) in basal media of DMEM/F12 + 1% PenStrep. Cryopreserved CD3+ cells from DN1 were thawed in 10 ml of basal media supplemented with 7% Bovine Serum Albumin (Sigma; stock solution prepared to 200 mg per ml in DPBS). The cells were centrifuged (all centrifugation steps hereon at 400 g for 10 min unless otherwise specified) and the pellet washed and resuspended in plating media (basal media supplemented with 2% Human Serum Albumin (Sigma; stock solution prepared to 200 mg per ml in DPBS)). The cells were then counted and resuspended at target density for seeding in plating media and activated with the addition of CD3/CD28/CD2 T-cell activator (Stem Cell Technologies) according to manufacturer dosage instructions. The 14 factors selected to supplement the base media were β-mercaptoethanol (bME)34,35,36, LS1000 Lipid Supplement (LS1000)37, sodium pyruvate (PY)34,38, Insulin-Transferrin-Selenium-Ethanolamine (ITS-X)35,39, albumin (rhALB)35,37,40,41, MEM non-essential amino acids solution (MEMAA)37,39, L-arginine (ARG)38, SG-200 solution (GLU)38,42, Cell Boost™ 6 (CN-T) supplement (CB6), IL-2 growth factor (rhIL-2)34,38,43, Interleukin 12 (rhIL-12)34,38, Interleukin 18 (rhIL-18)34, Interleukin 21 (rhIL-21)44, and MEM vitamin solution (VS)37. The dose ranges tested for each factor are listed in Supplementary Table 2.

### TF-1 test combination culture

Upon completion of the compounding of culture factor cocktails using the liquid handler, the cells were washed three times and resuspended in DMEM+1% Pen-Strep. The cell suspension was allocated to each well at a seeding density of 30,000 cells per ml and a total culture volume of 500 μl per well, and the plates were incubated for 5 days. The serum-containing culture condition (usual supplements added to a base medium of DMEM instead of RPMI 1640) was used as the “Positive Control” (PC) condition.

### TF-1 live cell count

The cell suspension was dissociated using TrypLE (Gibco), transferred into 96-well V-bottom plates, washed with PBS, and resuspended in HBSS+2% FBS with 1:1000 7-Aminoactinomycin D (7-AAD; Molecular Probes). The numbers of live cells in each well were counted using the HTS platform on the BD LSRFortessa flow cytometer (BD Biosciences) (Supplementary Fig. 9).

### Live cell count and T-cell phenotype characterization

On day 5, the culture plates were centrifuged and washed with DPBS to remove remnants of the culture media. The cells were incubated for 10 min with 30 μl TrypLE (Thermo Fisher) without fully dissociating the aggregates. All test wells with the exception of the unstained sample were resuspended with 70 μl Flow buffer (DPBS+2% HS+1 mM Ethylenediaminetetraacetic acid solution (EDTA, Sigma)) including 1:1000 7-AAD. The aggregates were fully dissociated by gentle pipetting just prior to initiation of the count protocol where the number of viable cells was counted using the CytoFlex (Beckman Coulter) in plate mode, sampling at 90 μl per min for 40 s.

The selected formulations and cells were prepared as described in the previous sections. On day 5, the culture plates were centrifuged and washed with DPBS to remove remnants of the culture media. The cells were incubated for 30 min at 4 °C in the dark with 50 μl T-cell flow cytometry panel master mix. After the incubation period, 100 μl DPBS was added to each well and the plates were centrifuged. The cells were resuspended in 100 μl Flow buffer and the aggregates fully dissociated by gentle pipetting just prior to initiation of the count protocol, in which the number of viable cells was counted using the CytoFlex (Beckman Coulter) in plate mode, sampling at 90 μl per min for 40 s (Supplementary Fig. 10).
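
The sampling settings above fix the volume analyzed per well: 90 μl per min for 40 s corresponds to 60 μl of the 100 μl resuspension. The short Python sketch below works through that arithmetic. Whether and how the recorded counts were scaled to a per-well total is not stated in this section, so the scaling function and the event count used in the example are illustrative assumptions only.

```python
def sampled_volume_ul(flow_rate_ul_per_min: float, duration_s: float) -> float:
    """Volume drawn by the cytometer during one acquisition, in microliters."""
    return flow_rate_ul_per_min * duration_s / 60.0


def cells_per_well(events: int, well_volume_ul: float, sampled_ul: float) -> float:
    """Scale counted events to the whole well (assumes a well-mixed suspension)."""
    return events * well_volume_ul / sampled_ul


# Settings taken from the text: 90 ul per min for 40 s, 100 ul Flow buffer per well.
sampled = sampled_volume_ul(90, 40)
# 12000 viable events is a made-up count used only to show the scaling.
estimate = cells_per_well(12000, 100, sampled)
print(sampled, estimate)  # 60.0 20000.0
```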

### Coding and statistical analyses

The algorithm was written in MATLAB (Mathworks) and executed on a Windows 8.1 device. The initial values of the native Differential Evolution parameters governing mutation (F) and crossover (CR) (Supplementary Fig. 1) were defined as F = 1 and CR = 0.5. The values of these parameters were changed according to the progression of the optimization: F was reduced to 0.5 upon detection of convergence of the overall score. Following the reduction in F, CR was reduced to 0.25 when at least half of the elements in the target formulation (Xi in Supplementary Fig. 1) were not changed between two consecutive generations. For the in silico and in vitro TF-1 cell media formulation optimization runs, a population size of 45 (3 × D, where D corresponded to the number of factors, 15) was used. For the in vitro T-cell media formulation run, a population size of 59 (3 × D + 17 extra test formulations to utilize the increased capacity of the liquid handling platform) was used. The in silico validation of the algorithm, including generation of simulated response data points, lasted for ~40 min for each run. The comparison of HD-DE with random selection (Supplementary Fig. 11) used the same benchmark as the in silico runs. For the in vitro experiments, the optimization parameters were defined within MATLAB and the algorithm was used to generate the test combinations. The reagent transfer commands were transferred from the MATLAB environment to the liquid handler interface, and the subsequent results of the in vitro culture were transferred from the flow cytometer software into MATLAB, in both cases as CSV files. The post hoc multivariable analysis and principal component analysis (PCA) were performed using JMP12 (SAS) on the same Windows 8.1 device. For the multivariable analysis, all tested formulations excluding the initial populations from all 3 TF-1 cell experimental runs were combined to generate the dataset. For T-cells, only one dataset was available.
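
The two-stage reduction of F and CR and the population sizes described in this section can be summarized in a minimal Python sketch. It is not the authors' MATLAB implementation: the names `DEParams`, `population_size`, and `update_params` are hypothetical, and because the text does not specify how convergence of the overall score is detected, the sketch takes that decision as an input supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class DEParams:
    """Mutation factor F and crossover rate CR of Differential Evolution."""
    F: float = 1.0    # initial value stated in the text
    CR: float = 0.5   # initial value stated in the text


def population_size(n_factors: int, extra: int = 0) -> int:
    """Population of 3 x D formulations plus optional extra test formulations."""
    return 3 * n_factors + extra


def update_params(params: DEParams, score_converged: bool,
                  unchanged_fraction: float) -> DEParams:
    """Apply the two-stage reduction of F and CR.

    score_converged is a hypothetical flag: convergence detection is not
    specified here. unchanged_fraction is the share of elements of the target
    formulation that did not change between two consecutive generations.
    """
    if params.F > 0.5:
        # Stage 1: reduce F once the overall score has converged.
        if score_converged:
            return DEParams(F=0.5, CR=params.CR)
        return params
    # Stage 2 (assumed to apply only after F was reduced): lower CR when at
    # least half of the elements are unchanged.
    if unchanged_fraction >= 0.5:
        return DEParams(F=params.F, CR=0.25)
    return params


p = DEParams()
p = update_params(p, score_converged=False, unchanged_fraction=0.8)  # unchanged
p = update_params(p, score_converged=True, unchanged_fraction=0.8)   # F -> 0.5
p = update_params(p, score_converged=True, unchanged_fraction=0.8)   # CR -> 0.25
print(p, population_size(15), population_size(14, extra=17))
```

With D = 15 the sketch gives the TF-1 population of 45, and with D = 14 plus 17 extra formulations it gives the T-cell population of 59.
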
Using JMP12, the response was then log-transformed. Imputation of left-censored data was performed by estimating a normal distribution of the response below the count sensitivity threshold. A quadratic polynomial model was fitted by least squares regression using the response screening platform in JMP. The equation included all quadratic (square) terms and all two-factor interaction (crossproduct) terms in equation (1):

$$Y = K + G_p + \sum_{j=1}^{D} \beta_j x_j + \sum_{i=1}^{D} \sum_{j=1}^{D} \beta_{ij} x_i x_j + \sum_{j=1}^{D} \beta_{jj} x_j^2 + \epsilon, \tag{1}$$

where Y corresponded to the log-transformed values of cell expansion, K corresponded to the intercept, Gp was the block parameter for the pth generation, D corresponded to the number of factors, βj corresponded to the main effect coefficient for factor j, βij corresponded to the interaction effect coefficient between factors i and j, βjj corresponded to the quadratic effect coefficient for factor j, xj corresponded to the coded dose [−1, 1] for factor j, xi corresponded to the coded dose [−1, 1] for factor i, and ε corresponded to the random error (residuals). The p-values of the regression coefficient estimates were false discovery rate (FDR)-corrected45, and coefficients with FDR-corrected p-values <0.05 were considered statistically significant. The values for βj, βij, and βjj and the corresponding FDR p-values are provided for each experiment in Supplementary Data 3. PCA was conducted on the compiled dataset of formulations of the final candidate solution set from all 3 experimental runs.
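
To make the structure of equation (1) concrete, the Python sketch below expands one formulation of coded doses into the regressors of the quadratic model and shows one common way to compute FDR-adjusted p-values. It rests on several assumptions: the interaction terms are taken over distinct factor pairs (i < j), since the squared terms appear separately; the generation block term Gp is omitted for brevity; and the Benjamini-Hochberg procedure is used as the FDR correction, which this section does not name. The analysis reported here was performed in JMP, so this illustrates the model form and is not the authors' workflow.

```python
from itertools import combinations


def model_row(x):
    """Expand coded doses x (each in [-1, 1]) into quadratic-model regressors:
    intercept, main effects, two-factor interactions, and squared terms."""
    row = [1.0]                                                       # intercept K
    row += list(x)                                                    # beta_j terms
    row += [x[i] * x[j] for i, j in combinations(range(len(x)), 2)]   # beta_ij terms
    row += [v * v for v in x]                                         # beta_jj terms
    return row


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        k = order[rank - 1]
        running_min = min(running_min, pvalues[k] * m / rank)
        adjusted[k] = running_min
    return adjusted


# With D = 15 factors: 1 intercept + 15 main + 105 interaction + 15 squared terms.
print(len(model_row([0.0] * 15)))  # 136
```

The rows produced by `model_row` would form the design matrix passed to any least squares routine, and `bh_adjust` would then be applied to the p-values of the fitted coefficients.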

### Algorithm analysis and selection

Multiple competition rounds between formulations within and across generations were incorporated for formulation selection and induction into the candidate solution set. All analyses were completed by the algorithm upon input of the test response (viable cell count), which either generated the test formulations for the next generation or determined the termination or completion of the optimization process. The competition was composed of three main elements. The first element was competition within a test generation, where the target formulations and trial formulations within each generation were compared using the Wilcoxon rank-sum test. The winning member of the population advanced to the next round. The second element was competition against the best encountered, where formulations previously selected into the candidate solution set were compared with any newly identified formulations, with consideration given to the estimated inter-experimental variability, to select the better of the two. The third element was clearing/niching where, at later generations, formulations that scored outside a pre-defined threshold of a 10% score range of the candidate solution set formulations were actively replaced by a clearing mechanism. A combination of stochastic selection and deterministic perturbation of the composition of root formulations was used to generate a pool of candidate formulations from which a test formulation was randomly selected.
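
The first competition element, a target-versus-trial comparison by the Wilcoxon rank-sum test, can be illustrated with a minimal Python sketch. The decision rule is an assumption: this section does not state the significance level, the sidedness of the test, or the number of replicate wells, so the sketch uses a two-sided normal approximation with midranks for ties, a hypothetical `alpha` of 0.05, and made-up replicate counts. The authors' MATLAB code may differ in all of these respects.

```python
import math
import statistics


def rank_sum_pvalue(a, b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, midranks
    for ties, no tie correction of the variance)."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2.0))


def select_winner(target_counts, trial_counts, alpha=0.05):
    """Return 'trial' only if its counts are significantly higher than the
    target's; otherwise the target formulation is retained."""
    p = rank_sum_pvalue(trial_counts, target_counts)
    trial_higher = statistics.median(trial_counts) > statistics.median(target_counts)
    return "trial" if (p < alpha and trial_higher) else "target"


# Hypothetical replicate viable-cell counts (in thousands), for illustration only.
target = [20, 22, 25, 27, 30, 31]
trial = [52, 55, 58, 60, 61, 63]
print(select_winner(target, trial))
```

The normal approximation is crude for very small replicate numbers; an exact test would be preferable in that case.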

### Similarity analysis

Two similarity metrics were adapted to assess the degree of similarity between two formulations at consecutive generations. First, the Hamming distance46 was counted as the number of factors in a query formulation whose dose level differed from the dose level of the same factor in the reference formulation. Second, the Levenshtein distance47 was measured as the sum of dose level differences between the query and reference formulations over all factors. Formally, the Levenshtein distance counts the number of edits (substitution, insertion, or deletion of an element) required to change one formulation sequence into the other. As the formulations being compared are of equal length, counting insertion or deletion edits becomes meaningless. Unlike the comparison of two text sequences, where such similarity measurements are often used, each element of the formulation can vary across a range of doses coded and represented as 0, 1, 2, 3, … This aspect of dose levels for each element was introduced into the substitution count of the Levenshtein distance measurement by considering the discrepancy in the number of dose levels between the corresponding elements in the query and reference formulations. For the purposes of formulation analysis, the Levenshtein-equivalent distance (hereon referred to as ‘Levenshtein distance’) measured the total number of dose level discrepancies across all elements of the formulation sequence.
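
Both metrics reduce to a few lines of code when a formulation is represented as a sequence of integer dose levels. The Python sketch below follows the definitions given in this section; the function names and the two example formulations are hypothetical.

```python
def hamming_distance(query, reference):
    """Number of factors whose dose level differs between two formulations."""
    if len(query) != len(reference):
        raise ValueError("formulations must have the same number of factors")
    return sum(1 for q, r in zip(query, reference) if q != r)


def dose_level_distance(query, reference):
    """Levenshtein-equivalent distance as defined in the text: the total
    number of dose level discrepancies summed over all factors."""
    if len(query) != len(reference):
        raise ValueError("formulations must have the same number of factors")
    return sum(abs(q - r) for q, r in zip(query, reference))


# Two made-up five-factor formulations with dose levels coded 0, 1, 2, 3.
reference = [0, 2, 1, 3, 0]
query = [0, 1, 1, 0, 2]
print(hamming_distance(query, reference))     # 3
print(dose_level_distance(query, reference))  # 6
```

In this example three factors differ, but the dose level distance is larger because it also records how far apart the differing doses are.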

### Code availability

The custom code is available at GitHub (https://github.com/julieaudet/cell-manufacturing).

### Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.