Research Article

Austin Biom and Biostat. 2016; 3(1): 1032.

# A Note on the Required Sample Size of Model-Based Dose-Finding Methods for Molecularly Targeted Agents

Sato H^{1}*†, Hirakawa A^{2}*† and Hamada C^{3}

^{1}Biostatistics Group, Center for Product Evaluation,
Pharmaceuticals and Medical Devices Agency, Japan

^{2}Statistical Analysis Section, Center for Advanced
Medicine and Clinical Research, Nagoya University
Hospital, Japan

^{3}Department of Information and Computer Technology,
Tokyo University of Science, Japan
†These authors contributed equally to this work.

**\*Corresponding author:** Sato H, Biostatistics Group,
Center for Product Evaluation, Pharmaceuticals and
Medical Devices Agency, 3-3-2 Kasumigaseki, Chiyodaku,
Tokyo 100-0013, Japan

**Received:** November 07, 2016; **Accepted:** December 05, 2016; **Published:** December 14, 2016

## Abstract

Some Molecularly Targeted Agents (MTAs) exhibit non-monotonic patterns in their dose-response relationships. Although many model-based dose-finding methods that account for such patterns have been proposed, the sample size required to determine the true Optimal Dose (OD) has not been adequately investigated. This limited knowledge of the required sample size may hinder the wide-ranging application of model-based dose-finding methods in practice. In this study, we focus on three model-based dose-finding methods that accommodate non-monotonic patterns in the dose-efficacy relationship, and discuss the required sample sizes under various conditions, using simulation studies. We found that the selection rate of the true OD did not necessarily improve as the sample size increased. Based on the results of our simulation studies, we provide notes and guidelines on sample size determination when using model-based dose-finding methods for MTAs.

**Keywords:** Change-point model; Sample size; Dose-finding; Oncology;
Phase I

## Abbreviations

AR: Adaptive Randomization; CP: Change Point; CP method: dose-finding method proposed by Sato et al.; CRM: Continual Reassessment Method; MCMC: Markov Chain Monte Carlo; MTD: Maximum Tolerated Dose; MTAs: Molecularly Targeted Agents; OD: Optimal Dose; TC method: dose-finding method proposed by Thall and Cook; WMD: Weighted Mahalanobis Distance; WT method: dose-finding method proposed by Wages and Tait

## Introduction

The objective of phase I oncology trials is generally to determine the Maximum Tolerated Dose (MTD). This is defined as the highest dose level that can be administered to patients with clinically acceptable toxicity. The dose-finding methods for determining the MTD are roughly categorized into two groups, model-based and rule-based methods. Rule-based methods, such as the 3+3 design, are widely used in practice, but the lack of statistical rationale and low accuracy of determining the true MTD are often problematic. Many model-based dose-finding methods, such as the Continual Reassessment Method (CRM) [1], assume that the probabilities of toxicity and efficacy of an agent increase monotonically as the dose of the agent increases; therefore, dose escalation or de-escalation is commonly based solely on toxicity outcome. Such methods outperform rule-based methods in many cases [2-4].

Some Molecularly Targeted Agents (MTAs) exhibit non-monotonic patterns in dose-efficacy relationships. Therefore, model-based dose-finding methods that rely on the above-mentioned monotonicity assumptions may not be reasonable for determining the Optimal Dose (OD) of MTAs. To account for non-monotonic patterns in the dose-efficacy relationships of MTAs, dose-finding methods that determine the OD based on both toxicity and efficacy outcomes are required. The OD is often considered to be the dose level with the maximum efficacy probability among the dose levels with toxicity probabilities lower than a pre-specified value (e.g., 30 or 40%), although the definition of the OD varies depending on the individual method proposed. Many researchers have developed dose-finding methods based on toxicity and efficacy outcomes for single-agent or two-agent combination phase I trials [5-11]. Thall and Cook [6] proposed using the Gumbel model [12] to capture the relationship between the bivariate binary toxicity and efficacy outcomes (termed the TC method). They used a quadratic model for the dose-efficacy relationship in order to consider a non-monotonic pattern. Wages and Tait [11] proposed using a power model for the binary efficacy and toxicity outcomes (termed the WT method). They assumed a class of working models for the efficacy outcome and used model selection techniques to allow greater flexibility in modeling the dose-efficacy relationship. Recently, we developed a new dose-finding method using the Change-Point (CP) logistic model for single MTA trials (termed the CP method) [13]. Specifically, we developed a dose-efficacy model, the parameters of which are allowed to change in the vicinity of the change point of the dose level, in order to address non-monotonic patterns of the dose-efficacy relationship. The change point is defined as the dose that maximizes the log-likelihood of the assumed dose-efficacy and dose-toxicity models.

Although many useful dose-finding methods have been proposed that account for non-monotonic patterns of the dose-efficacy relationship for MTAs, the sample size required to determine the true OD using these methods has not been adequately investigated. For instance, the selection rate for the true OD is generally evaluated using fixed sample sizes in simulation studies [6,11,13], whereas the sample size required to achieve a target selection rate for the true OD is not. Thus, little is known about the required sample size for the existing dose-finding methods for MTAs. Investigators who wish to use novel model-based dose-finding methods would benefit from knowing the required sample sizes under various conditions (e.g., number of dose levels evaluated and prior distribution for model parameters). In this study, we focus on three model-based dose-finding methods that can be used for MTAs (i.e., the CP, TC, and WT methods), and discuss the sample size required to determine the true OD under various conditions, using simulation studies. Based on the results of the simulation studies, we provide notes and guidelines for determining the sample size for model-based dose-finding methods for MTAs.

This paper is organized as follows: in the next section, we provide an overview of the three dose-finding methods. The simulation studies are described in the third section, and we discuss the determination of the required sample size and provide guidelines for determining the sample size for model-based dose-finding methods for MTAs in the fourth section.

## Dose-Finding Methods Used

## An adaptive dose-finding method for an MTA using the Change-Point model (CP method)

Let Y_{Ei} and Y_{Ti} denote binary efficacy and toxicity outcomes for
the ith of N patients, respectively. Y_{Ei} (or Y_{Ti}) = 1 indicates that efficacy
(or toxicity) is observed, and Y_{Ei} (or Y_{Ti}) = 0 otherwise. Following Islam
et al. [14], the joint probabilities for Y_{Ei} and Y_{Ti} are given in Table 1.

**Table 1:** The joint probabilities for Y_{Ei} and Y_{Ti}.

| | Y_{Ti} = 0 | Y_{Ti} = 1 | Total |
|---|---|---|---|
| Y_{Ei} = 0 | π_{00} | π_{01} | 1-π_{E} |
| Y_{Ei} = 1 | π_{10} | π_{11} | π_{E} |
| Total | 1-π_{T} | π_{T} | 1 |

To model the toxicity outcomes, the bivariate joint probability
function for Y_{Ei} and Y_{Ti} is factorized into the conditional probability
of toxicity given an efficacy outcome Pr(Y_{Ti}= k| Y_{Ei}= j; k,j = 0,1) and
the marginal probability of efficacy Pr(Y_{Ei}= j; j = 0,1) as follows:

$\mathrm{Pr}\left(y_{Ei},y_{Ti}\right)=\prod_{j=0}^{1}\prod_{k=0}^{1}\pi_{jk}^{y_{ijk}}=\prod_{j=0}^{1}\prod_{k=0}^{1}\left[\mathrm{Pr}\left(Y_{Ti}=k\mid Y_{Ei}=j\right)\mathrm{Pr}\left(Y_{Ei}=j\right)\right]^{y_{ijk}}\quad\text{(1)}$

where

y_{i00} = (1-y_{Ei})(1-y_{Ti}), j=0, k=0,

y_{i01} = (1-y_{Ei})y_{Ti}, j=0, k=1,

y_{i10} = y_{Ei}(1-y_{Ti}), j=1, k=0, and

y_{i11} = y_{Ei}y_{Ti}, j=1, k=1.

The conditional probability functions of toxicity given each efficacy outcome are modeled by an ordinary logistic model, that is,

$\mathrm{Pr}\left({Y}_{Ti}=1|{Y}_{Ei}=0\right)={\pi}_{T|{Y}_{E}=0}\left({x}_{i};{\theta}_{0}\right)=\frac{\mathrm{exp}\left({\alpha}_{0}+{\beta}_{0}{x}_{i}\right)}{1+\mathrm{exp}\left({\alpha}_{0}+{\beta}_{0}{x}_{i}\right)}\text{(2)}$ and

$\mathrm{Pr}\left({Y}_{Ti}=1|{Y}_{Ei}=1\right)={\pi}_{T|{Y}_{E}=1}\left({x}_{i};{\theta}_{1}\right)=\frac{\mathrm{exp}\left({\alpha}_{1}+{\beta}_{1}{x}_{i}\right)}{1+\mathrm{exp}\left({\alpha}_{1}+{\beta}_{1}{x}_{i}\right)}\text{(3)}$

where x_{i} ∈ {d_{1},…,d_{L}} is the actual dose of the agent administered to the
ith patient, and θ_{0} = {α_{0},β_{0}} and θ_{1} = {α_{1},β_{1}} are unknown parameters for the models in Equations (2) and (3), respectively. Given the actual dose
d_{l} (l=1,…,L), we consider the standardized dose $d_{l}'=\log\left(d_{l}\right)-L^{-1}\sum_{l=1}^{L}\log\left(d_{l}\right)$. It
should be noted that these conditional models are equal (i.e., θ_{0} = θ_{1})
under independence of efficacy and toxicity [14].

Next, we propose a CP logistic model for modeling the marginal probability function for efficacy, as follows:

$\mathrm{Pr}\left(Y_{Ei}=1\right)=\pi_{E}\left(x_{i}\right)=\begin{cases}\pi_{E}\left(x_{i};\theta_{E}\right)=\dfrac{\exp\left(\alpha_{E}+\beta_{E}x_{i}\right)}{1+\exp\left(\alpha_{E}+\beta_{E}x_{i}\right)}, & x_{i}\le d^{*}\\[1ex] \pi_{E}\left(x_{i};\theta_{E}'\right)=\dfrac{\exp\left(\alpha_{E}'+\beta_{E}'x_{i}\right)}{1+\exp\left(\alpha_{E}'+\beta_{E}'x_{i}\right)}, & x_{i}>d^{*}\end{cases}\quad\text{(4)}$

where d* is the change point of the dose, taking one of the values $d_{1}',\dots,d_{L-1}'$, and
θ_{E} = {α_{E},β_{E}} and $\theta_{E}'=\left\{\alpha_{E}',\beta_{E}'\right\}$ are unknown parameters.
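As an illustration of Equation (4), the following minimal Python sketch evaluates the change-point logistic model at the six standardized doses used later in the simulation study. The function names and the parameter values are hypothetical and were chosen only to produce a rise-then-fall pattern; they are not estimates from the paper.

```python
import math

def logistic(z):
    """Logistic function exp(z) / (1 + exp(z)), written as 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def cp_efficacy_prob(x, change_point, theta_low, theta_high):
    """Efficacy probability under the change-point logistic model (Equation 4).

    theta_low  = (alpha_E, beta_E)    is used when x <= change_point
    theta_high = (alpha_E', beta_E')  is used when x >  change_point
    """
    alpha, beta = theta_low if x <= change_point else theta_high
    return logistic(alpha + beta * x)

# Hypothetical parameter values (assumptions for illustration only).
doses = [-1.097, -0.403, 0.002, 0.290, 0.513, 0.695]
probs = [cp_efficacy_prob(x, 0.290, (0.0, 2.0), (0.5, -1.5)) for x in doses]
for x, p in zip(doses, probs):
    print(f"standardized dose {x:+.3f}: efficacy probability {p:.3f}")
```

With these illustrative values, the efficacy probability increases up to the change point and decreases beyond it, which is the kind of non-monotonic pattern the model is intended to capture.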

For the current data of n patients, D_{n}, we calculate the likelihoods
under each of the assumptions $d^{*}=d_{1}',\dots,d_{L-1}'$, that is,
$\mathcal{L}_{n,l}\left(\theta_{l}\mid D_{n},d^{*}=d_{l}'\right)$, where $\theta_{l}=\{\theta_{0l},\theta_{1l},\theta_{El},\theta_{El}'\}$. In the Bayesian inference for
θ_{l}, we assume that the prior distribution f(θ_{l}) for each parameter is
an independent normal distribution, although other distributions
can be used. For each $\mathcal{L}_{n,l}$ (l=1,…,L-1), the posterior distribution of
θ_{l} is given by $f\left(\theta_{l}\mid D_{n},d^{*}=d_{l}'\right)\propto f\left(\theta_{l}\right)\mathcal{L}_{n,l}\left(\theta_{l}\mid D_{n},d^{*}=d_{l}'\right)$. Using the Markov
chain Monte Carlo (MCMC) method, we obtain the posterior mean $\hat{\theta}_{l}$
for each θ_{l}.

Owing to its ease of use, we used the method of Rukhin [15] to determine the change point. Given the posterior means $\hat{\theta}_{l}$ $(l=1,\dots,L-1)$, we determine the estimated change point $\tilde{d}^{*}$ as the candidate that provides the maximum value of $\log\mathcal{L}_{n,l}\left(\hat{\theta}_{l}\mid D_{n},d^{*}=d_{l}'\right)$, that is,

$\tilde{d}^{*}=\mathrm{arg}\underset{d_{1}'\le d^{*}\le d_{L-1}'}{\mathrm{max}}\left\{\log\mathcal{L}_{n,l}\left(\hat{\theta}_{l}\mid D_{n},d^{*}=d_{l}'\right)\right\}.\quad\text{(5)}$

**Dose allocation algorithm in the CP method:** To stabilize the
parameter estimates for θ_{l} and d* at an early stage of the trial,
we incorporate a run-in period in which the first cohort of patients
is treated at the lowest dose level and the dose level is escalated unless
two or more of the three patients in the current cohort experience
toxicity. A cohort consists of three patients throughout.

After the run-in period, we start the model-based dose-finding stage. Using the estimated change point $\tilde{d}^{*}$ and the corresponding posterior means $\hat{\theta}_{l}$, we calculate the posterior probabilities of efficacy and toxicity outcomes for each dose $d_{l}'$ $(l=1,\dots,L)$, which are denoted as $\hat{\pi}_{E}\left(d_{l}'\right)$ and $\hat{\pi}_{T}\left(d_{l}'\right)$ $\left(=\hat{\pi}_{E}\left(d_{l}'\right)\times\hat{\pi}_{T|Y_{E}=1}\left(d_{l}'\right)+\left\{1-\hat{\pi}_{E}\left(d_{l}'\right)\right\}\times\hat{\pi}_{T|Y_{E}=0}\left(d_{l}'\right)\right)$, respectively. To avoid allocating ineffective or severely toxic dose levels, we determine the set of acceptable doses based on these probabilities, as follows [6]:

$T\left(d_{l}'\right)=\left\{d_{l}'\mid\mathrm{Pr}\left(\hat{\pi}_{E}\left(d_{l}'\right)>c_{E}\right)>\delta_{E}\text{ and }\mathrm{Pr}\left(\hat{\pi}_{T}\left(d_{l}'\right)<c_{T}\right)>\delta_{T},\ l=1,\dots,L\right\}\quad\text{(6)}$

where c_{E} and c_{T} are the respective critical values for the posterior
probabilities of efficacy and toxicity outcomes, and δ_{E} and δ_{T} are fixed
probability cutoffs. That is, we extract the doses that are expected to
be effective and not severely toxic at a certain level.
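The acceptability rule in Equation (6) can be sketched from posterior draws as follows. This is a minimal Python illustration, assuming that MCMC draws of the efficacy and toxicity probabilities are already available for each dose; the function name and the draws shown are hypothetical.

```python
def acceptable_doses(eff_samples, tox_samples, c_e, c_t, delta_e, delta_t):
    """Return the indices of doses that satisfy both criteria of Equation (6).

    eff_samples[l] and tox_samples[l] are lists of posterior draws of the
    efficacy and toxicity probabilities at dose l (assumed to be given).
    """
    accepted = []
    for l, (eff, tox) in enumerate(zip(eff_samples, tox_samples)):
        pr_eff = sum(e > c_e for e in eff) / len(eff)  # Pr(pi_E > c_E | data)
        pr_tox = sum(t < c_t for t in tox) / len(tox)  # Pr(pi_T < c_T | data)
        if pr_eff > delta_e and pr_tox > delta_t:
            accepted.append(l)
    return accepted

# Hypothetical posterior draws for three doses (illustration only).
eff = [[0.05, 0.10, 0.15, 0.12], [0.30, 0.45, 0.25, 0.50], [0.60, 0.55, 0.70, 0.65]]
tox = [[0.02, 0.05, 0.03, 0.04], [0.10, 0.20, 0.15, 0.12], [0.50, 0.60, 0.45, 0.55]]
print(acceptable_doses(eff, tox, c_e=0.20, c_t=0.40, delta_e=0.10, delta_t=0.10))
```

In this toy input, the first dose fails the efficacy criterion and the third fails the toxicity criterion, so only the second dose (index 1) remains acceptable.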

Among the doses in $T\left(d_{l}'\right)$, we select the dose to be allocated to the next cohort of patients based on the Weighted Mahalanobis Distance (WMD) proposed by Hirakawa [8]. For the kth posterior sample generated by the MCMC method, we obtain the WMD from the outcome $\left(\pi_{E}^{\left(k\right)}\left(d_{l}'\right),\pi_{T}^{\left(k\right)}\left(d_{l}'\right)\right)$ to the optimal point (1,0):

${m}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)=\sqrt{\frac{{w}_{E}^{2}{u}_{E}{\left({d}_{l}^{\text{'}}\right)}^{2}-2\rho \left({d}_{l}^{\text{'}}\right){w}_{E}{w}_{T}{u}_{E}\left({d}_{l}^{\text{'}}\right){u}_{T}\left({d}_{l}^{\text{'}}\right)+{w}_{T}^{2}{u}_{T}{\left({d}_{l}^{\text{'}}\right)}^{2}}{1-\rho {\left({d}_{l}^{\text{'}}\right)}^{2}}}\text{(7)}$

where

${u}_{E}\left({d}_{l}^{\text{'}}\right)=\frac{1-{\pi}_{E}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)}{\sqrt{{\pi}_{E}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{E}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)\right)}}\text{(8)}$

${u}_{T}\left({d}_{l}^{\text{'}}\right)=\frac{0-{\pi}_{T}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)}{\sqrt{{\pi}_{T}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{T}^{\left(k\right)}\left({d}_{l}^{\text{'}}\right)\right)}}\text{(9)}$

and w_{E} and w_{T} are the prespecified weight parameters for adjusting the trade-off between efficacy and toxicity, respectively. $\rho\left(d_{l}'\right)$ denotes the correlation coefficient described in Islam et al. [14].

The posterior mean of the WMD is given by averaging the posterior samples, that is,

$\overline{m}\left(d_{l}'\right)=\frac{1}{K}\sum_{k=1}^{K}m^{\left(k\right)}\left(d_{l}'\right)\quad\text{(10)}$

The dose with the minimum value of $\overline{m}\left(d_{l}'\right)$ among $T\left(d_{l}'\right)$ is allocated to the next cohort of patients. If there is no acceptable dose at an interim time point, then the trial is terminated at that time point and no dose is selected as the OD. Otherwise, we apply this algorithm until the maximum sample size is reached and then select as the OD the dose that would be allocated to the next cohort of patients.
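Equations (7)-(10) can be checked numerically with a short Python sketch. The function names below are hypothetical. As a sanity check, the sketch evaluates the WMD at the true Scenario 1 probabilities with ρ = 0.20 and unit weights, the settings used in the simulation study for the true WMD values reported in Table 2.

```python
import math

def wmd(pi_e, pi_t, rho, w_e=1.0, w_t=1.0):
    """Weighted Mahalanobis distance from (pi_e, pi_t) to the optimal point (1, 0)."""
    u_e = (1.0 - pi_e) / math.sqrt(pi_e * (1.0 - pi_e))  # Equation (8)
    u_t = (0.0 - pi_t) / math.sqrt(pi_t * (1.0 - pi_t))  # Equation (9)
    num = (w_e * u_e) ** 2 - 2.0 * rho * w_e * w_t * u_e * u_t + (w_t * u_t) ** 2
    return math.sqrt(num / (1.0 - rho ** 2))             # Equation (7)

def posterior_mean_wmd(eff_draws, tox_draws, rho, w_e=1.0, w_t=1.0):
    """Average the WMD over paired posterior draws, as in Equation (10)."""
    values = [wmd(e, t, rho, w_e, w_t) for e, t in zip(eff_draws, tox_draws)]
    return sum(values) / len(values)

# True (pi_T, pi_E) pairs of Scenario 1 in Table 2.
scenario_1 = [(0.05, 0.05), (0.10, 0.20), (0.15, 0.35),
              (0.20, 0.50), (0.25, 0.45), (0.30, 0.40)]
true_wmd = [round(wmd(pi_e, pi_t, rho=0.20), 2) for pi_t, pi_e in scenario_1]
print(true_wmd)
```

The smallest value occurs at dose level 4, which agrees with the minimum of the Scenario 1 WMD row in Table 2.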

## Dose-finding based on efficacy-toxicity trade-offs (TC method)

Thall and Cook [6] formulate the marginal probabilities of toxicity
π_{T}(d',θ_{T}) and efficacy π_{E}(d',θ_{E}) as follows:

${\pi}_{T}\left({d}_{}^{\text{'}},{\theta}_{T}\right)=\frac{\mathrm{exp}\left({\mu}_{T}+{d}_{}^{\text{'}}{\beta}_{T}\right)}{1+\mathrm{exp}\left({\mu}_{T}+{d}_{}^{\text{'}}{\beta}_{T}\right)}\text{(11)}$

${\pi}_{E}\left({d}_{}^{\text{'}},{\theta}_{E}\right)=\frac{\mathrm{exp}\left({\mu}_{E}+{d}_{}^{\text{'}}{\beta}_{E,1}+{d}_{}^{{\text{'}}^{2}}{\beta}_{E,2}\right)}{1+\mathrm{exp}\left({\mu}_{E}+{d}_{}^{\text{'}}{\beta}_{E,1}+{d}_{}^{{\text{'}}^{2}}{\beta}_{E,2}\right)}\text{(12)}$

where θ_{T}=(μ_{T},β_{T}) and θ_{E}=(μ_{E},β_{E,1},β_{E,2}) are unknown parameters. They
propose using a Gumbel model [12] in the form of:

$\pi_{a,b}=\mathrm{Pr}\left(Y_{E}=a,Y_{T}=b\mid d',\theta\right)=\left(\pi_{E}\right)^{a}\left(1-\pi_{E}\right)^{1-a}\left(\pi_{T}\right)^{b}\left(1-\pi_{T}\right)^{1-b}+\left(-1\right)^{a+b}\pi_{E}\left(1-\pi_{E}\right)\pi_{T}\left(1-\pi_{T}\right)\left(\frac{e^{\psi}-1}{e^{\psi}+1}\right)\quad\text{(13)}$

for a,b∈{0,1} and the association parameter ψ. Thus,
θ=(μ_{T},β_{T},μ_{E},β_{E,1},β_{E,2},ψ).
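The Gumbel model in Equation (13) can be written as a small Python function. The sketch below uses hypothetical marginal probabilities and a hypothetical association value, and verifies two properties that follow directly from the formula: the four joint probabilities sum to one, and summing over the toxicity outcome recovers the marginal efficacy probability.

```python
import math

def gumbel_joint(a, b, pi_e, pi_t, psi):
    """Joint probability Pr(Y_E = a, Y_T = b) under the Gumbel model (Equation 13)."""
    base = (pi_e ** a) * ((1.0 - pi_e) ** (1 - a)) * (pi_t ** b) * ((1.0 - pi_t) ** (1 - b))
    assoc = (math.exp(psi) - 1.0) / (math.exp(psi) + 1.0)
    return base + (-1) ** (a + b) * pi_e * (1.0 - pi_e) * pi_t * (1.0 - pi_t) * assoc

# Hypothetical inputs (illustration only).
pi_e, pi_t, psi = 0.50, 0.20, 1.0
joint = {(a, b): gumbel_joint(a, b, pi_e, pi_t, psi) for a in (0, 1) for b in (0, 1)}
print(joint)
print(sum(joint.values()))            # total probability
print(joint[(1, 0)] + joint[(1, 1)])  # marginal efficacy probability
```

Setting psi to 0 removes the association term, so the joint probabilities reduce to the products of the marginals.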

Denoting the data for the first n patients in the trial as D_{n}, they
calculate the likelihood L_{n}(D_{n}|θ). They assume each component θ_{q} of θ
is normally distributed, with mean $\tilde{\mu}_{q}$ and standard deviation $\tilde{\sigma}_{q}$. Let $\xi=\left(\tilde{\mu}_{1},\tilde{\sigma}_{1},\tilde{\mu}_{2},\tilde{\sigma}_{2},\dots,\tilde{\mu}_{6},\tilde{\sigma}_{6}\right)$ denote the vector of hyperparameters, with all
prior covariances set equal to 0, and let φ(θ|ξ) denote the multivariate
normal prior of θ. To compute posteriors, they numerically integrate
L_{n}(D_{n}|θ)φ(θ|ξ) with respect to θ using the method of Monahan and
Genz [16].

**Dose-finding algorithm in the TC method:** The first cohort is treated at the starting dose specified by the physician. They define the set of acceptable doses based on the probabilities shown in Equation (6). For each subsequent cohort, a dose $d'$ is regarded as acceptable, that is, $d'\in T\left(d_{l}'\right)$, if it satisfies Equation (6), or if it is the lowest untried dose above the starting dose and satisfies $\mathrm{Pr}\left(\hat{\pi}_{T}\left(d_{l}'\right)<c_{T}\right)>\delta_{T}$.

The dose-finding algorithm is based on explicit trade-offs between
π_{E} and π_{T}. They construct a target efficacy-toxicity trade-off contour,
C, by fitting a curve to target values $\left\{\pi_{1}^{*},\pi_{2}^{*},\pi_{3}^{*}\right\}$ that the physician
considers equally desirable. Once C is established, they use it to define the desirability of any pair of probabilities $q=\left(\hat{\pi}_{E}\left(d_{l}'\right),\hat{\pi}_{T}\left(d_{l}'\right)\right)$ as follows.
Draw a straight line, *Line(q)*, from *q* to (1, 0), and find the point *p* where
*Line(q)* intersects C. Calculate the Euclidean distances *ρ(p)* from *p*
to (1, 0), and *ρ(q)* from *q* to (1, 0). To reflect the fact that values of *q*
closer to (1, 0) are more desirable, they define the desirability of *q* to
be *D(q) = ρ(p)/ρ(q) - 1*.

If $T\left(d_{l}'\right)=\emptyset$, then the trial is terminated and no dose is selected. Otherwise, the dose that maximizes D(q) is selected among the doses $d_{l}'\in T\left(d_{l}'\right)$, subject to the constraint that no untried dose may be skipped when escalating. This algorithm is applied until the maximum sample size is reached.

## Model-selecting dose-finding method (WT method)

Wages and Tait [11] formulate the marginal probability of toxicity as
$\pi_{T}\left(d_{l}\right)=F\left(d_{l},\beta\right)=q_{l}^{\exp\left(\beta\right)}$, where 0 < q_{1} < … < q_{L} < 1 are standardized
units (the skeleton) representing the discrete dose levels d_{l}, l=1,…,L. At
the same time, they make use of a class of working models and
model selection techniques in order to allow for more flexibility in
modeling the dose-efficacy relationship. They specify K = 2 × L - 1
working models: L unimodal skeletons, with nodes at each of the L
doses, and L - 1 plateau skeletons, with nodes at each of the first L - 1
doses. For a particular skeleton k (k=1,…,K), they model the marginal
probability of efficacy as $\pi_{E}\left(d_{l}\right)=G_{k}\left(d_{l},\theta\right)=p_{lk}^{\exp\left(\theta\right)}$ for a class of working
dose-efficacy models with parameter θ. Here, 0<p_{1k}<…<p_{Lk}<1 is the skeleton of
model k. Further, they account for any prior information concerning
the plausibility of each model, and so introduce Pr(Model_{k}) such that $\sum_{k=1}^{K}\mathrm{Pr}\left(\mathrm{Model}_{k}\right)=1$.

They estimate the parameters β and θ within a Bayesian
framework. For the current data of n patients, D_{n}, they estimate the
parameter β by calculating the likelihood L(β|D_{n}) and using a
normal prior distribution g(β); the posterior distribution
of β is given by g(β|D_{n})∝g(β)L(β|D_{n}). To estimate the parameter θ, they
use the likelihood under model k, L_{k}(θ|D_{n}), together with a normal
prior h(θ). Given the data D_{n} and the likelihood, the posterior density
for θ is given by h_{k}(θ|D_{n})∝h(θ)L_{k}(θ|D_{n}). This information can be used
to establish posterior model probabilities given the data as

$\text{PMP}\left(\mathrm{Model}_{k}\right)=\mathrm{Pr}\left(\mathrm{Model}_{k}\mid D_{n}\right)=\frac{\mathrm{Pr}\left(\mathrm{Model}_{k}\right)\int_{\Theta}\mathcal{L}_{k}\left(\theta\mid D_{n}\right)h\left(\theta\right)d\theta}{\sum_{k=1}^{K}\mathrm{Pr}\left(\mathrm{Model}_{k}\right)\int_{\Theta}\mathcal{L}_{k}\left(\theta\mid D_{n}\right)h\left(\theta\right)d\theta}\quad\text{(14)}$

The prior model probabilities, Pr(Model_{k}), are updated with the
efficacy data. Each time a new patient is to be enrolled, they choose
a single skeleton, k*, with the largest posterior probability such that

k*=arg max_{k} PMP(Model_{k}), (15)
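Equations (14) and (15) can be sketched numerically as follows. This Python illustration is not the released R code of Wages and Tait: the two skeletons, the toy data, the prior standard deviation of 1.0, and the trapezoidal integration grid are all assumptions made here for demonstration.

```python
import math

def efficacy_likelihood(theta, skeleton, dose_idx, responses):
    """Binary-response likelihood under the power model p_lk ** exp(theta)."""
    lik = 1.0
    for l, y in zip(dose_idx, responses):
        p = skeleton[l] ** math.exp(theta)
        lik *= p if y == 1 else (1.0 - p)
    return lik

def normal_pdf(x, sd):
    """Density of a zero-mean normal prior with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def marginal_likelihood(skeleton, dose_idx, responses, prior_sd, grid):
    """Trapezoidal approximation to the integral of L_k(theta | D_n) h(theta)."""
    vals = [efficacy_likelihood(t, skeleton, dose_idx, responses) * normal_pdf(t, prior_sd)
            for t in grid]
    step = grid[1] - grid[0]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def posterior_model_probs(skeletons, prior_probs, dose_idx, responses, prior_sd=1.0):
    """Posterior model probabilities of Equation (14); prior_sd is an assumption."""
    grid = [-5.0 + 0.01 * i for i in range(1001)]
    weighted = [pr * marginal_likelihood(s, dose_idx, responses, prior_sd, grid)
                for s, pr in zip(skeletons, prior_probs)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical skeletons: one increasing, one peaking at the second dose.
skeletons = [[0.2, 0.4, 0.6], [0.4, 0.6, 0.4]]
dose_idx = [0, 0, 0, 1, 1, 1, 2, 2, 2]    # dose index of each patient
responses = [0, 0, 1, 1, 1, 1, 0, 0, 0]   # toy efficacy outcomes
pmp = posterior_model_probs(skeletons, [0.5, 0.5], dose_idx, responses)
k_star = max(range(len(pmp)), key=lambda k: pmp[k])  # Equation (15)
print(pmp, k_star)
```

Because the toy responses cluster at the second dose, the unimodal skeleton receives the larger posterior model probability and is the one selected.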

They then utilize $G_{k^{*}}\left(d_{l},\theta\right)$ to generate efficacy probability estimates at each dose. Beginning with the prior for θ and having included the jth subject, they can compute the posterior probability of a response for d_{l} so that

$\hat{\pi}_{E}\left(d_{l}\right)=G_{k^{*}}\left(d_{l},\hat{\theta}_{k^{*}}\right)=p_{lk^{*}}^{\exp\left(\hat{\theta}_{jk^{*}}\right)}\quad\text{(16)}$

**Dose-finding algorithm in the WT method:** Overall, each enrolled patient is allocated the dose estimated to be the most efficacious among those with acceptable toxicity. In general, after n enrolled patients, they define the set of acceptable doses as

$A_{n}=\left\{d_{l}:\hat{\pi}_{T}\left(d_{l}\right)\le\phi_{T}\right\}\quad\text{(17)}$

where $\phi_{T}$ is the maximum acceptable toxicity rate.

Early in the trial, they do not rely entirely on the maximization
of estimated efficacy probabilities for guidance as to the most
appropriate treatment but rather implement Adaptive Randomization
(AR) to obtain broader information. Based on the estimated efficacy
probabilities, $\hat{\pi}_{E}\left(d_{l}\right)$, for doses in A_{n}, a randomization probability R_{l} is calculated:

$R_{l}=\frac{\hat{\pi}_{E}\left(d_{l}\right)}{\sum_{d_{l}\in A_{n}}\hat{\pi}_{E}\left(d_{l}\right)}\quad\text{(18)}$

and the next patient or cohort of patients is randomized to dose d_{l}
with probability R_{l}. They rely on this randomization algorithm for a
subset of n_{R} patients. Further, the starting dose is chosen with
probability R_{l} based on the starting skeleton, k*, for efficacy.

Upon completion of the AR phase, the trial design switches to
a maximization phase, in which maximized efficacy probability
estimates guide allocation. Among the doses contained in A_{n}, they
allocate the (n + 1)th patient cohort to the dose x_{n+1} according to the
estimated efficacy probabilities, $\hat{\pi}_{E}\left(d_{l}\right)$, such that

$x_{n+1}=\mathrm{arg}\,\mathrm{max}_{d_{l}\in A_{n}}\hat{\pi}_{E}\left(d_{l}\right)\quad\text{(19)}$
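The allocation rules in Equations (17)-(19) can be summarized in a short Python sketch. The function names and the estimated probabilities are hypothetical, and the stopping rules of Wages and Tait [11] are not implemented here.

```python
import random

def acceptable_set(tox_est, phi_t):
    """Indices of doses whose estimated toxicity does not exceed phi_t (Equation 17)."""
    return [l for l, t in enumerate(tox_est) if t <= phi_t]

def randomization_probs(eff_est, acceptable):
    """Adaptive randomization probabilities R_l over the acceptable set (Equation 18)."""
    total = sum(eff_est[l] for l in acceptable)
    return {l: eff_est[l] / total for l in acceptable}

def next_dose(eff_est, tox_est, phi_t, in_ar_phase, rng=random):
    """Pick the next dose index, or None if no dose is acceptable."""
    acc = acceptable_set(tox_est, phi_t)
    if not acc:
        return None
    if in_ar_phase:
        probs = randomization_probs(eff_est, acc)
        return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
    return max(acc, key=lambda l: eff_est[l])  # maximization phase (Equation 19)

# Hypothetical estimates for four doses (illustration only).
eff_est = [0.10, 0.30, 0.50, 0.60]
tox_est = [0.05, 0.10, 0.25, 0.45]
print(randomization_probs(eff_est, acceptable_set(tox_est, 0.30)))
print(next_dose(eff_est, tox_est, 0.30, in_ar_phase=False))
print(next_dose(eff_est, tox_est, 0.30, in_ar_phase=True, rng=random.Random(1)))
```

In this toy input, the fourth dose has the highest estimated efficacy but is excluded because its estimated toxicity exceeds the limit, so the maximization phase selects the third dose (index 2).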

If the stopping rules (the details can be found in Wages and Tait [11]) take effect at an interim time point, then the trial is terminated at that time point and no dose is selected as the OD. Otherwise, this algorithm is continued until the maximum sample size is reached.

## Simulation Studies

## Common settings for the three dose-finding methods

We considered two actual dose sets in a single-agent dose-finding
trial: six actual doses d_{l}={1,2,3,4,5,6} and four actual doses
d_{l}={1,2,3,4}. Given these actual doses, the standardized doses were
$d_{l}'=\{-1.097,-0.403,0.002,0.290,0.513,0.695\}$ for the six actual doses, and $d_{l}'=\{-0.795,-0.101,0.304,0.592\}$ for the four actual doses [6]. The starting dose was set as the lowest dose, $d_{1}'$.
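The standardized doses quoted above follow from the log-centering definition given for the CP method (the log dose minus the mean log dose). A minimal Python sketch, with a hypothetical function name, reproduces them:

```python
import math

def standardize(doses):
    """Standardized doses: log(d_l) minus the mean of log(d_l) over all dose levels."""
    logs = [math.log(d) for d in doses]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

six = [round(v, 3) for v in standardize([1, 2, 3, 4, 5, 6])]
four = [round(v, 3) for v in standardize([1, 2, 3, 4])]
print(six)
print(four)
```

By construction, the standardized doses sum to zero within each dose set.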

We investigated ten different scenarios with respect to the true probabilities of efficacy and toxicity for the dose levels, $\pi_{T}\left(d_{l}'\right)$ and $\pi_{E}\left(d_{l}'\right)$ (Table 2). The dose-efficacy and dose-toxicity relationships based on $\pi_{T}\left(d_{l}'\right)$ and $\pi_{E}\left(d_{l}'\right)$ are shown in Figure 1. In each scenario, the conditional probabilities $\mathrm{Pr}\left(Y_{E}=1\mid Y_{T}=0,d_{l}'\right)$ and $\mathrm{Pr}\left(Y_{E}=1\mid Y_{T}=1,d_{l}'\right)$ had to be specified; they were calculated by substituting the true $\pi_{T}\left(d_{l}'\right)$, $\pi_{E}\left(d_{l}'\right)$, and $\rho\left(d_{l}'\right)=\rho=0.20$ into the following equations, although the resulting values are not shown in this paper:

**Table 2:** True values of (π_{T}, π_{E}), Weighted Mahalanobis Distance (WMD), and trade-off value for each dose level. The OD is shown in bold.

| Scenario | Quantity | Dose level 1 | Dose level 2 | Dose level 3 | Dose level 4 | Dose level 5 | Dose level 6 |
|---|---|---|---|---|---|---|---|
| 1 | π_{T}, π_{E} | 0.05, 0.05 | 0.10, 0.20 | 0.15, 0.35 | **0.20, 0.50** | 0.25, 0.45 | 0.30, 0.40 |
| | WMD | 4.50 | 2.14 | 1.54 | **1.23** | 1.37 | 1.53 |
| | Trade-off value | -0.22 | -0.03 | 0.17 | **0.35** | 0.28 | 0.21 |
| 2 | π_{T}, π_{E} | 0.05, 0.05 | 0.10, 0.15 | 0.15, 0.25 | **0.25, 0.60** | 0.45, 0.65 | 0.65, 0.70 |
| | WMD | 4.50 | 2.52 | 1.90 | **1.11** | 1.30 | 1.66 |
| | Trade-off value | -0.22 | -0.09 | 0.04 | **0.46** | 0.37 | 0.15 |
| 3 | π_{T}, π_{E} | 0.05, 0.05 | 0.10, 0.25 | **0.15, 0.50** | 0.20, 0.45 | 0.25, 0.40 | 0.30, 0.35 |
| | WMD | 4.50 | 1.87 | **1.18** | 1.33 | 1.48 | 1.66 |
| | Trade-off value | -0.22 | 0.04 | **0.36** | 0.29 | 0.22 | 0.15 |
| 4 | π_{T}, π_{E} | 0.05, 0.05 | 0.15, 0.30 | **0.25, 0.55** | 0.35, 0.57 | 0.45, 0.59 | 0.55, 0.61 |
| | WMD | 4.50 | 1.70 | **1.19** | 1.27 | 1.37 | 1.52 |
| | Trade-off value | -0.22 | 0.10 | **0.41** | 0.39 | 0.33 | 0.24 |
| 5 | π_{T}, π_{E} | 0.05, 0.30 | **0.15, 0.70** | 0.25, 0.60 | 0.35, 0.50 | 0.45, 0.40 | 0.55, 0.30 |
| | WMD | 1.62 | **0.86** | 1.11 | 1.38 | 1.70 | 2.10 |
| | Trade-off value | 0.10 | **0.61** | 0.46 | 0.31 | 0.16 | 0.01 |
| 6 | π_{T}, π_{E} | 0.05, 0.30 | **0.08, 0.68** | 0.22, 0.70 | 0.35, 0.72 | 0.55, 0.74 | 0.75, 0.76 |
| | WMD | 1.62 | **0.82** | 0.94 | 1.08 | 1.38 | 1.96 |
| | Trade-off value | 0.10 | **0.59** | 0.58 | 0.50 | 0.28 | 0.03 |
| 7 | π_{T}, π_{E} | 0.05, 0.05 | 0.10, 0.30 | 0.15, 0.50 | **0.20, 0.80** | | |
| | WMD | 4.50 | 1.66 | 1.18 | **0.79** | | |
| | Trade-off value | -0.22 | 0.10 | 0.36 | **0.69** | | |
| 8 | π_{T}, π_{E} | 0.05, 0.05 | 0.10, 0.30 | **0.30, 0.55** | 0.80, 0.65 | | |
| | WMD | 4.50 | 1.66 | **1.24** | 2.31 | | |
| | Trade-off value | -0.22 | 0.10 | **0.39** | -0.04 | | |
| 9 | π_{T}, π_{E} | 0.05, 0.25 | **0.15, 0.65** | 0.40, 0.50 | 0.65, 0.10 | | |
| | WMD | 1.83 | **0.93** | 1.44 | 3.61 | | |
| | Trade-off value | 0.04 | **0.55** | 0.29 | -0.24 | | |
| 10 | π_{T}, π_{E} | 0.05, 0.25 | **0.20, 0.65** | 0.50, 0.68 | 0.80, 0.71 | | |
| | WMD | 1.83 | **0.99** | 1.35 | 2.26 | | |
| | Trade-off value | 0.04 | **0.54** | 0.33 | -0.03 | | |

**Figure 1:** Ten simulation scenarios. The Optimal Dose (OD) is indicated by the dose level enclosed in a square.

$\mathrm{Pr}\left({Y}_{E}=1|{Y}_{T}=0,{d}_{l}^{\text{'}}\right)=\frac{\rho \sqrt{{\pi}_{E}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{E}\left({d}_{l}^{\text{'}}\right)\right){\pi}_{T}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{T}\left({d}_{l}^{\text{'}}\right)\right)}-{\pi}_{E}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{T}\left({d}_{l}^{\text{'}}\right)\right)}{{\pi}_{T}\left({d}_{l}^{\text{'}}\right)-1}\text{(20)}$

$\mathrm{Pr}\left({Y}_{E}=1|{Y}_{T}=1,{d}_{l}^{\text{'}}\right)=\frac{\rho \sqrt{{\pi}_{E}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{E}\left({d}_{l}^{\text{'}}\right)\right){\pi}_{T}\left({d}_{l}^{\text{'}}\right)\left(1-{\pi}_{T}\left({d}_{l}^{\text{'}}\right)\right)}+{\pi}_{E}\left({d}_{l}^{\text{'}}\right){\pi}_{T}\left({d}_{l}^{\text{'}}\right)}{{\pi}_{T}\left({d}_{l}^{\text{'}}\right)}\text{(21)}$
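Equations (20) and (21) are straightforward to evaluate. The minimal Python sketch below, with a hypothetical function name, computes both conditional probabilities for the true values at dose level 4 of Scenario 1 (π_T = 0.20, π_E = 0.50) with ρ = 0.20, and checks that weighting them by the toxicity probabilities recovers the marginal efficacy probability.

```python
import math

def conditional_efficacy(pi_e, pi_t, rho):
    """Return (Pr(Y_E=1 | Y_T=0), Pr(Y_E=1 | Y_T=1)) from Equations (20) and (21)."""
    root = math.sqrt(pi_e * (1.0 - pi_e) * pi_t * (1.0 - pi_t))
    given_no_tox = (rho * root - pi_e * (1.0 - pi_t)) / (pi_t - 1.0)  # Equation (20)
    given_tox = (rho * root + pi_e * pi_t) / pi_t                     # Equation (21)
    return given_no_tox, given_tox

p_no_tox, p_tox = conditional_efficacy(pi_e=0.50, pi_t=0.20, rho=0.20)
print(p_no_tox, p_tox)
# Law of total probability: the marginal efficacy probability is recovered.
print(p_no_tox * (1.0 - 0.20) + p_tox * 0.20)
```

A positive ρ makes efficacy more likely among patients who experience toxicity than among those who do not, while the marginal probability is unchanged.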

Using the ten scenarios, we assessed the selection rates of the true OD for sample sizes N of 36, 48, 60, and 72 in scenarios 1-6, and 24, 30, 36, and 42 in scenarios 7-10. The cohort size was set to 3. Each simulation study consisted of 1,000 trials.

## Settings for the CP method

We used the MCMC procedure (PROC MCMC) in SAS, version 9.3 (SAS Institute Inc., Cary, NC, USA) to obtain the posterior distributions of parameters. The method for specifying the hyperparameters of the prior normal distributions

$(\text{i.e., }\alpha_{0}\sim N\left(\hat{\eta}_{0},\sigma^{2}\right),\ \beta_{0}\sim N\left(\hat{\xi}_{0},\sigma^{2}\right),\ \alpha_{1}\sim N\left(\hat{\eta}_{1},\sigma^{2}\right),\ \beta_{1}\sim N\left(\hat{\xi}_{1},\sigma^{2}\right),$

$\alpha_{E}\sim N\left(\hat{\eta}_{E},\sigma^{2}\right),\ \beta_{E}\sim N\left(\hat{\xi}_{E},\sigma^{2}\right),\ \alpha_{E}'\sim N\left(\hat{\eta}_{E}',\sigma^{2}\right),\text{ and }\beta_{E}'\sim N\left(\hat{\xi}_{E}',\sigma^{2}\right))$

was described by Sato et al. [13]. We considered six sets of prior efficacy and toxicity probabilities of dose levels (Table 3) and a correlation coefficient of Ψ1=Ψ=0.20 (see Equation (9) in Sato et al. [13]) to generate the means of the prior normal distributions. The standard deviations were all set to 3.0. Using these values, we evaluated the effects of the hyperparameters of the prior normal distributions (Table 4) on the selection rate for the true OD of the CP method.

**Table 3:** The prior toxicity and efficacy probabilities (q_{1},p_{1}).

| Setting | d_{1} | d_{2} | d_{3} | d_{4} | d_{5} | d_{6} |
|---|---|---|---|---|---|---|
| 1 | (0.05, 0.05) | (0.10, 0.20) | (0.15, 0.35) | (0.20, 0.50) | (0.25, 0.55) | (0.30, 0.60) |
| 2 | (0.05, 0.30) | (0.10, 0.70) | (0.15, 0.60) | (0.20, 0.50) | (0.25, 0.40) | (0.30, 0.30) |
| 3 | (0.05, 0.05) | (0.10, 0.20) | (0.15, 0.35) | (0.20, 0.50) | (0.25, 0.65) | (0.30, 0.80) |
| 4 | (0.05, 0.05) | (0.15, 0.25) | (0.25, 0.40) | (0.35, 0.55) | | |
| 5 | (0.05, 0.30) | (0.15, 0.70) | (0.25, 0.50) | (0.35, 0.30) | | |
| 6 | (0.05, 0.05) | (0.15, 0.50) | (0.25, 0.55) | (0.35, 0.60) | | |

**Table 4:** Mean and standard deviation of the prior normal distribution in the CP method.

| Prior setting | Mean | | | | | | | | $\sigma$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | -2.191 | 0.936 | -0.951 | 0.353 | -0.599 | 2.114 | -0.375 | 1.123 | 3.0 |
| 2 | -2.662 | 1.800 | -1.160 | 1.005 | 0.274 | 0.732 | -0.833 | 0.546 | 3.0 |
| 3 | -2.315 | 0.717 | -0.983 | 0.298 | -0.736 | 2.308 | -0.736 | 2.308 | 3.0 |
| 4 | -1.535 | 1.483 | -0.554 | 0.630 | -0.024 | 2.663 | -0.410 | 2.107 | 3.0 |
| 5 | -1.732 | 2.141 | -0.604 | 1.416 | 1.834 | 2.445 | 0.006 | -2.945 | 3.0 |
| 6 | -1.700 | 0.603 | -0.672 | 1.432 | 1.714 | 4.248 | 0.199 | 0.712 | 3.0 |

The weight parameters w_{E} and w_{T} for the WMD were set to 1.0. The value of the true WMD shown in Table 2 was obtained by substituting the true $\pi_{T}\left(d_{l}'\right)$ and $\pi_{E}\left(d_{l}'\right)$ into Equations (8) and (9), and $\rho\left(d_{l}'\right)=\rho=0.20$ into Equation (7). The critical values for the posterior probabilities of efficacy and toxicity, c_{E} and c_{T}, were set to 0.20 and 0.40, respectively, and the fixed probability cutoffs δ_{E} and δ_{T} were both set to 0.10.

## Settings for the TC method

We used the publicly released software "EffTox" (version 4.0.12), downloaded from https://biostatistics.mdanderson.org/SoftwareDownload/Default.aspx. Using the EffTox software, we achieved a fair comparison between the three methods (the details can be found in Sato et al. [13]). The hyperparameters of the prior distribution for the model parameters were calculated automatically from the prior efficacy and toxicity probabilities (Table 3) and the effective sample size. The effective sample size was set to 0.90, following the recommendation of the software developer. EffTox requires specification of a trade-off value, which was termed the "desirability parameter" in the original paper by Thall and Cook [6]. We set

$\{\pi_{1}^{*}, \pi_{2}^{*}, \pi_{3}^{*}\} = \{(\pi_{E1}^{*}, 0), (1, \pi_{T2}^{*}), (\pi_{E3}^{*}, \pi_{T3}^{*})\} = \{(0.22, 0), (1, 0.78), (0.27, 0.51)\}$

to obtain the equalized Euclidean distance from the respective points on the trade-off contour to the point (1, 0), and obtained the true trade-off value shown in Table 2 by inputting the true $\pi_{T}(d_{l}')$, $\pi_{E}(d_{l}')$, and $(\pi_{E}^{*}, \pi_{T}^{*})$ into the EffTox software. The critical values for the posterior probabilities of efficacy and toxicity, c_{E} and c_{T}, were set to 0.20 and 0.40, respectively, and the fixed probability cutoffs d_{E} and d_{T} were both set to 0.10.
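To illustrate how the critical values and cutoffs interact, the following minimal sketch assumes the usual acceptability rule of Thall and Cook [6]: a dose is treated as admissible when Pr(π_E > c_E | data) > d_E and Pr(π_T < c_T | data) > d_T. For simplicity, it replaces the joint models of the three methods with independent Beta-binomial posteriors under a Beta(0.5, 0.5) prior. That substitution, the function names, and the example counts are assumptions made for illustration only; they are not part of EffTox or of any method evaluated here.

```python
import random


def posterior_prob(events, n, threshold, above, draws=20000, seed=1):
    """Monte Carlo estimate of Pr(p > threshold) (above=True) or
    Pr(p <= threshold) (above=False) under a Beta(0.5 + events,
    0.5 + n - events) posterior. The Beta-binomial model is an
    illustrative stand-in, not the model of the CP, TC, or WT method."""
    rng = random.Random(seed)
    a, b = 0.5 + events, 0.5 + n - events
    hits = 0
    for _ in range(draws):
        p = rng.betavariate(a, b)
        if (p > threshold) == above:
            hits += 1
    return hits / draws


def is_admissible(n, n_eff, n_tox, c_E=0.20, c_T=0.40, d_E=0.10, d_T=0.10):
    """Assumed rule: admissible if efficacy is plausibly above c_E
    and toxicity is plausibly below c_T, each with probability > cutoff."""
    eff_ok = posterior_prob(n_eff, n, c_E, above=True) > d_E
    tox_ok = posterior_prob(n_tox, n, c_T, above=False) > d_T
    return eff_ok and tox_ok


# Hypothetical cohorts of 12 patients at a single dose.
print(is_admissible(12, n_eff=8, n_tox=1))  # efficacious and safe
print(is_admissible(12, n_eff=0, n_tox=1))  # no responses observed
print(is_admissible(12, n_eff=8, n_tox=9))  # excessive toxicity
```

With cutoffs as low as 0.10, a dose is excluded only when the data argue strongly against it, which keeps doses in play early in the trial when little information is available.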

## Settings for the WT method

We used the R code released at https://faculty.virginia.edu/model-based_dose-finding/Wages%20and%20Tait%20R%20code.R to perform the WT method. We set the skeleton values for the marginal probability of toxicity to q=(q_{1},q_{2},q_{3},q_{4},q_{5},q_{6})=(0.01,0.08,0.15,0.22,0.29,0.36) for scenarios 1-6, and q=(q_{1},q_{2},q_{3},q_{4})=(0.05,0.15,0.25,0.35) for scenarios 7-10. In scenarios 1-6, we set eleven skeletons for the marginal probability of efficacy p_{k}=(p_{1k},p_{2k},p_{3k},p_{4k},p_{5k},p_{6k}), k=1,…,11 as follows:

p_{1}=(0.60,0.50,0.40,0.30,0.20,0.10),

p_{2}=(0.50,0.60,0.50,0.40,0.30,0.20),

p_{3}=(0.40,0.50,0.60,0.50,0.40,0.30),

p_{4}=(0.30,0.40,0.50,0.60,0.50,0.40),

p_{5}=(0.20,0.30,0.40,0.50,0.60,0.50),

p_{6}=(0.10,0.20,0.30,0.40,0.50,0.60),

p_{7}=(0.20,0.30,0.40,0.50,0.60,0.60),

p_{8}=(0.30,0.40,0.50,0.60,0.60,0.60),

p_{9}=(0.40,0.50,0.60,0.60,0.60,0.60),

p_{10}=(0.50,0.60,0.60,0.60,0.60,0.60), and

p_{11}=(0.60,0.60,0.60,0.60,0.60,0.60).

Additionally, in scenarios 7-10, we set seven skeletons p_{k}=(p_{1k},p_{2k},p_{3k},p_{4k}), k=1,…,7 as follows:

p_{1}=(0.60,0.45,0.30,0.15),

p_{2}=(0.45,0.60,0.45,0.30),

p_{3}=(0.30,0.45,0.60,0.45),

p_{4}=(0.15,0.30,0.45,0.60),

p_{5}=(0.30,0.45,0.60,0.60),

p_{6}=(0.45,0.60,0.60,0.60), and

p_{7}=(0.60,0.60,0.60,0.60).

We assumed that each of these models was equally likely and set Pr(Model_{k})=1/11 in scenarios 1-6 and Pr(Model_{k})=1/7 in scenarios 7-10. For the normal prior distributions g(β) and h(θ), we used mean 0 and variance 1.34 in prior settings 1 and 4, mean 0 and variance 3 in prior settings 2 and 5, and mean 0 and variance 0.5 in prior settings 3 and 6. The size of the adaptive randomization phase was set equal to half of the maximum sample size. The critical values for the posterior probabilities of efficacy and toxicity, c_{E} and c_{T}, were set to 0.20 and 0.40, respectively.
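The efficacy skeletons listed above follow a regular pattern: for J dose levels there are J unimodal skeletons that peak at 0.60 at each dose in turn, followed by J-1 plateau skeletons that rise to 0.60 and stay there, giving 2J-1 working models in total. The sketch below reproduces the listed values under that reading. The function name and its `peak` and `step` parameters are hypothetical and are not taken from the released R code.

```python
def efficacy_skeletons(n_doses, peak=0.60, step=0.10):
    """Return the unimodal skeletons followed by the plateau skeletons."""
    doses = range(1, n_doses + 1)
    # Unimodal: efficacy peaks at dose k and falls by `step` per level on either side.
    unimodal = [[round(peak - step * abs(j - k), 2) for j in doses] for k in doses]
    # Plateau: efficacy rises by `step` per level up to dose k, then stays at `peak`.
    # k runs from n_doses - 1 down to 1, matching the order used in the text.
    plateau = [[round(peak - step * max(0, k - j), 2) for j in doses]
               for k in range(n_doses - 1, 0, -1)]
    return unimodal + plateau


six = efficacy_skeletons(6, step=0.10)   # scenarios 1-6
four = efficacy_skeletons(4, step=0.15)  # scenarios 7-10

# Equal prior model probabilities, as assumed in the text.
prior_six = 1 / len(six)    # 1/11
prior_four = 1 / len(four)  # 1/7

for k, skeleton in enumerate(six, start=1):
    print(f"p_{k} = {skeleton}")
```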

## Simulation results

Table 5 shows the selection rates for the true OD of each dose-finding method for each prior setting under scenarios 1-6 with six dose levels; these rates are also displayed in Figure 2. As shown in Figure 2, the selection rates for the true OD of the three methods were almost constant, irrespective of the prior settings. Across prior settings 1-3, the average increase in the selection rate for the true OD when the sample size was increased from 36 to 72 was 5.6, -0.3, and 4.5 percentage points for the CP, TC, and WT methods, respectively. For the TC and WT methods, the selection rate for the true OD decreased as the sample size increased in some cases. In scenario 2, in which the probabilities of toxicity and efficacy increase roughly monotonically with the dose, the average increase in the selection rate for the true OD when the sample size was increased from 36 to 72 was 11.7, 7.7, and 8.0 percentage points for the CP, TC, and WT methods, respectively.
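The average increases quoted in this paragraph can be reproduced from the values in Table 5. The following sketch transcribes the table and, for each method, averages the difference between the selection rate at N=72 and at N=36 over scenarios and prior settings. The variable and function names are illustrative.

```python
# Selection rates (%) transcribed from Table 5.
# Each list holds 12 values: prior settings 1, 2, 3, each at N = 36, 48, 60, 72.
TABLE5 = {
    1: {"CP": [39, 39, 42, 44, 37, 42, 40, 45, 35, 38, 41, 40],
        "TC": [9, 7, 5, 5, 10, 7, 7, 7, 9, 8, 5, 4],
        "WT": [38, 45, 47, 50, 35, 45, 44, 51, 40, 46, 51, 51]},
    2: {"CP": [55, 63, 62, 65, 59, 64, 66, 71, 52, 57, 61, 65],
        "TC": [44, 51, 50, 53, 52, 56, 58, 62, 47, 50, 46, 51],
        "WT": [44, 43, 46, 50, 37, 42, 45, 49, 44, 48, 48, 50]},
    3: {"CP": [41, 43, 46, 44, 44, 50, 48, 52, 36, 38, 42, 43],
        "TC": [2, 2, 2, 2, 5, 4, 3, 3, 3, 2, 2, 1],
        "WT": [41, 44, 50, 49, 46, 48, 49, 52, 41, 41, 46, 48]},
    4: {"CP": [47, 49, 50, 51, 49, 50, 52, 52, 41, 45, 42, 46],
        "TC": [25, 22, 24, 23, 28, 26, 29, 26, 24, 25, 21, 23],
        "WT": [46, 44, 46, 48, 43, 46, 46, 46, 46, 43, 45, 48]},
    5: {"CP": [64, 63, 67, 70, 69, 65, 69, 73, 64, 61, 63, 63],
        "TC": [24, 23, 26, 24, 25, 28, 28, 26, 24, 24, 25, 24],
        "WT": [69, 68, 70, 72, 65, 65, 67, 69, 67, 71, 68, 74]},
    6: {"CP": [49, 51, 50, 52, 52, 52, 56, 54, 49, 49, 50, 53],
        "TC": [13, 12, 11, 11, 14, 14, 14, 11, 13, 11, 11, 10],
        "WT": [40, 35, 35, 34, 39, 34, 30, 32, 45, 39, 36, 34]},
}


def mean_increase(table, method, scenarios=None):
    """Mean of (rate at largest N) - (rate at smallest N) over the
    selected scenarios and all prior settings."""
    diffs = []
    for scenario, rows in table.items():
        if scenarios is not None and scenario not in scenarios:
            continue
        rates = rows[method]
        # Split the 12 values into three blocks of four (one per prior setting).
        for start in range(0, len(rates), 4):
            block = rates[start:start + 4]
            diffs.append(block[-1] - block[0])
    return sum(diffs) / len(diffs)


methods = ("CP", "TC", "WT")
overall = {m: round(mean_increase(TABLE5, m), 1) for m in methods}
scenario2 = {m: round(mean_increase(TABLE5, m, scenarios={2}), 1) for m in methods}
print(overall)    # {'CP': 5.6, 'TC': -0.3, 'WT': 4.5}
print(scenario2)  # {'CP': 11.7, 'TC': 7.7, 'WT': 8.0}
```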

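**Table 5:** The selection rate (%) of the true OD in Scenarios 1-6.

| Scenario | Method | Prior 1, N=36 | Prior 1, N=48 | Prior 1, N=60 | Prior 1, N=72 | Prior 2, N=36 | Prior 2, N=48 | Prior 2, N=60 | Prior 2, N=72 | Prior 3, N=36 | Prior 3, N=48 | Prior 3, N=60 | Prior 3, N=72 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CP | 39 | 39 | 42 | 44 | 37 | 42 | 40 | 45 | 35 | 38 | 41 | 40 |
| 1 | TC | 9 | 7 | 5 | 5 | 10 | 7 | 7 | 7 | 9 | 8 | 5 | 4 |
| 1 | WT | 38 | 45 | 47 | 50 | 35 | 45 | 44 | 51 | 40 | 46 | 51 | 51 |
| 2 | CP | 55 | 63 | 62 | 65 | 59 | 64 | 66 | 71 | 52 | 57 | 61 | 65 |
| 2 | TC | 44 | 51 | 50 | 53 | 52 | 56 | 58 | 62 | 47 | 50 | 46 | 51 |
| 2 | WT | 44 | 43 | 46 | 50 | 37 | 42 | 45 | 49 | 44 | 48 | 48 | 50 |
| 3 | CP | 41 | 43 | 46 | 44 | 44 | 50 | 48 | 52 | 36 | 38 | 42 | 43 |
| 3 | TC | 2 | 2 | 2 | 2 | 5 | 4 | 3 | 3 | 3 | 2 | 2 | 1 |
| 3 | WT | 41 | 44 | 50 | 49 | 46 | 48 | 49 | 52 | 41 | 41 | 46 | 48 |
| 4 | CP | 47 | 49 | 50 | 51 | 49 | 50 | 52 | 52 | 41 | 45 | 42 | 46 |
| 4 | TC | 25 | 22 | 24 | 23 | 28 | 26 | 29 | 26 | 24 | 25 | 21 | 23 |
| 4 | WT | 46 | 44 | 46 | 48 | 43 | 46 | 46 | 46 | 46 | 43 | 45 | 48 |
| 5 | CP | 64 | 63 | 67 | 70 | 69 | 65 | 69 | 73 | 64 | 61 | 63 | 63 |
| 5 | TC | 24 | 23 | 26 | 24 | 25 | 28 | 28 | 26 | 24 | 24 | 25 | 24 |
| 5 | WT | 69 | 68 | 70 | 72 | 65 | 65 | 67 | 69 | 67 | 71 | 68 | 74 |
| 6 | CP | 49 | 51 | 50 | 52 | 52 | 52 | 56 | 54 | 49 | 49 | 50 | 53 |
| 6 | TC | 13 | 12 | 11 | 11 | 14 | 14 | 14 | 11 | 13 | 11 | 11 | 10 |
| 6 | WT | 40 | 35 | 35 | 34 | 39 | 34 | 30 | 32 | 45 | 39 | 36 | 34 |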

**Figure 2:**The selection rate of the true OD in scenarios 1-6.

Table 6 shows the selection rates for the true OD of each dose-finding method for each prior setting under scenarios 7-10 with four dose levels; these rates are also displayed in Figure 3. The selection rate for the true OD of the three methods increased slightly as the sample size increased. Across prior settings 4-6, the average increase in the selection rate for the true OD when the sample size was increased from 24 to 42 was 5.3, 7.3, and 9.5 percentage points for the CP, TC, and WT methods, respectively. For all three methods, the magnitude of the increase in the selection rate for the true OD as the sample size increased was similar, irrespective of the prior settings and of scenarios 7-10.

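**Table 6:** The selection rate (%) of the true OD in Scenarios 7-10.

| Scenario | Method | Prior 4, N=24 | Prior 4, N=30 | Prior 4, N=36 | Prior 4, N=42 | Prior 5, N=24 | Prior 5, N=30 | Prior 5, N=36 | Prior 5, N=42 | Prior 6, N=24 | Prior 6, N=30 | Prior 6, N=36 | Prior 6, N=42 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | CP | 82 | 83 | 84 | 87 | 62 | 62 | 63 | 68 | 74 | 74 | 77 | 75 |
| 7 | TC | 86 | 89 | 90 | 91 | 83 | 86 | 89 | 90 | 85 | 85 | 91 | 91 |
| 7 | WT | 64 | 65 | 69 | 74 | 59 | 61 | 64 | 69 | 66 | 71 | 75 | 80 |
| 8 | CP | 74 | 77 | 82 | 82 | 74 | 74 | 76 | 79 | 83 | 78 | 91 | 80 |
| 8 | TC | 88 | 92 | 93 | 95 | 88 | 91 | 94 | 94 | 89 | 92 | 93 | 95 |
| 8 | WT | 71 | 75 | 78 | 82 | 65 | 69 | 73 | 78 | 75 | 78 | 83 | 84 |
| 9 | CP | 72 | 74 | 77 | 78 | 79 | 80 | 82 | 84 | 76 | 78 | 78 | 81 |
| 9 | TC | 44 | 48 | 48 | 51 | 47 | 52 | 52 | 55 | 46 | 46 | 47 | 54 |
| 9 | WT | 79 | 83 | 87 | 89 | 77 | 83 | 86 | 88 | 80 | 86 | 87 | 88 |
| 10 | CP | 74 | 77 | 82 | 84 | 78 | 83 | 85 | 85 | 77 | 83 | 83 | 85 |
| 10 | TC | 68 | 71 | 75 | 78 | 73 | 76 | 81 | 80 | 70 | 77 | 76 | 80 |
| 10 | WT | 73 | 73 | 77 | 77 | 73 | 75 | 76 | 78 | 72 | 76 | 77 | 81 |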

**Figure 3:**The selection rate of the true OD in scenarios 7-10.

## Discussion

In this study, we used simulation studies to assess the relationship between the selection rate for the true OD and the sample size in three model-based dose-finding methods that account for a non-monotonic dose-efficacy curve. According to the report of Le Tourneau et al. [17], the sample sizes used in phase I trials evaluating four and six dose levels are 20-30 and 40-50, respectively. We therefore evaluated the selection rate for the true OD of the three methods using sample sizes of 24-42 for the scenarios with four dose levels and of 36-72 for the scenarios with six dose levels.

The simulation studies revealed several important findings with respect to the relationship between the selection rate for the true OD and the sample size. First, when the number of dose levels was six, the selection rate for the true OD did not substantially improve as the sample size increased, even when the sample size was doubled. The selection rate for the true OD of the best-performing method under each scenario was at most 50-70% when the sample size ranged from 36 to 72. This result suggests that the model-based dose-finding methods used in this study may be unable to capture complex dose-efficacy and dose-toxicity curves with a sample size that is feasible for phase I trials. The development of new dose-finding methods is warranted to address this issue. Similar findings were also observed when the number of dose levels was five (results not shown).

Second, the selection rate for the true OD improved as the sample size increased when the number of dose levels was four. In many cases, the best-performing method under each scenario maintained a selection rate for the true OD of approximately 80%. The best-performing method for selecting the true OD was the TC method in scenarios 7 and 8, the WT method in scenario 9, and the CP method in scenario 10; thus, the performance of each method varied depending on the scenario. It is therefore important to consider carefully the plausible dose-efficacy and dose-toxicity curves of the investigational MTA before beginning the trial.
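The ranking of the methods by scenario can be checked against Table 6. The sketch below assumes that "best-performing" means the highest selection rate averaged over the three prior settings and four sample sizes; that criterion is an assumption made for illustration, not one stated in the article. The names are illustrative.

```python
from statistics import mean

# Selection rates (%) transcribed from Table 6.
# Each list holds 12 values: prior settings 4, 5, 6, each at N = 24, 30, 36, 42.
TABLE6 = {
    7: {"CP": [82, 83, 84, 87, 62, 62, 63, 68, 74, 74, 77, 75],
        "TC": [86, 89, 90, 91, 83, 86, 89, 90, 85, 85, 91, 91],
        "WT": [64, 65, 69, 74, 59, 61, 64, 69, 66, 71, 75, 80]},
    8: {"CP": [74, 77, 82, 82, 74, 74, 76, 79, 83, 78, 91, 80],
        "TC": [88, 92, 93, 95, 88, 91, 94, 94, 89, 92, 93, 95],
        "WT": [71, 75, 78, 82, 65, 69, 73, 78, 75, 78, 83, 84]},
    9: {"CP": [72, 74, 77, 78, 79, 80, 82, 84, 76, 78, 78, 81],
        "TC": [44, 48, 48, 51, 47, 52, 52, 55, 46, 46, 47, 54],
        "WT": [79, 83, 87, 89, 77, 83, 86, 88, 80, 86, 87, 88]},
    10: {"CP": [74, 77, 82, 84, 78, 83, 85, 85, 77, 83, 83, 85],
         "TC": [68, 71, 75, 78, 73, 76, 81, 80, 70, 77, 76, 80],
         "WT": [73, 73, 77, 77, 73, 75, 76, 78, 72, 76, 77, 81]},
}


def best_method(rows):
    """Return the method with the highest mean selection rate in one scenario."""
    return max(rows, key=lambda method: mean(rows[method]))


best = {scenario: best_method(rows) for scenario, rows in TABLE6.items()}
print(best)  # {7: 'TC', 8: 'TC', 9: 'WT', 10: 'CP'}
```

Under this averaging criterion the ranking agrees with the one reported above, and no single method is best in all four scenarios.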

Third, the prior distributions assumed for the model parameters of each method did not affect the above findings. This is a desirable operating characteristic for model-based dose-finding methods based on Bayes' theorem, although the prior should still be fine-tuned through a simulation study before the trial is conducted, so that the chosen dose-finding method performs as well as possible in selecting the true OD in practice.

## Conclusion

In conclusion, when planning a phase I dose-finding trial for a single MTA, we recommend attempting to reduce the number of dose levels, based on the data available for the investigational MTA, such as pre-clinical data. When four dose levels of a single MTA are evaluated, the three model-based dose-finding methods examined in this study would provide better performance in selecting the true OD with a sample size that is feasible for a phase I trial. However, to determine the OD among five or more dose levels, the operating characteristics of the dose-finding method should be carefully examined in a simulation study before beginning the trial.

## Funding Statement

This work was partially supported by JSPS KAKENHI [Grant Number 15K15948] (Grant-in-Aid for Young Scientists B). The views expressed herein are the result of independent work and do not necessarily represent the views of the Pharmaceuticals and Medical Devices Agency.

## References

- O'Quigley J, Pepe M, Fisher L. Continual reassessment method: a practical design for phase I clinical trials in cancer. Biometrics. 1990; 46: 33-48.
- Goodman SN, Zahurak ML, Piantadosi S. Some practical improvements in the continual reassessment method for phase I studies. Stat Med. 1995; 14: 1149-1161.
- Ahn C. An evaluation of phase I cancer clinical trial designs. Stat Med. 1998; 17: 1537-1549.
- Iasonos A, Wilton AS, Riedel ER, Seshan VE, Spriggs DR. A comprehensive comparison of the continual reassessment method to the standard 3+3 dose escalation scheme in phase I dose-finding studies. Clin Trials. 2008; 5: 465-477.
- Braun TM. The bivariate continual reassessment method: extending the CRM to phase I trials of two competing outcomes. Control Clin Trials. 2002; 23: 240-256.
- Thall PF, Cook JD. Dose-finding based on efficacy-toxicity trade-offs. Biometrics. 2004; 60: 684-693.
- Bekele BN, Shen Y. A Bayesian approach to jointly modeling toxicity and biomarker expression in a phase I/II dose-finding trial. Biometrics. 2005; 61: 344-354.
- Hirakawa A. An adaptive dose-finding approach for correlated bivariate binary and continuous outcomes in phase I oncology trials. Stat Med. 2012; 31: 516-532.
- Asakawa T, Hamada C. A pragmatic dose-finding approach using short-term surrogate efficacy outcomes to evaluate binary efficacy and toxicity outcomes in phase I cancer clinical trials. Pharm Stat. 2013; 12: 315-327.
- Asakawa T, Hirakawa A, Hamada C. Bayesian model averaging continual reassessment method for bivariate binary efficacy and toxicity outcomes in phase I oncology trials. J Biopharm Stat. 2014; 24: 310-325.
- Wages NA, Tait C. Seamless phase I/II adaptive design for oncology trials of molecularly targeted agents. J Biopharm Stat. 2015; 25: 903-920.
- Murtaugh PA, Fisher LD. Bivariate binary models of efficacy and toxicity in dose-ranging trials. Commun Stat A-Theor. 1990; 19: 2003-2020.
- Sato H, Hirakawa A, Hamada C. An adaptive dose-finding method using a change-point model for molecularly targeted agents in phase I trials. Stat Med. 2016; 35: 4093-4109.
- Islam MA, Chowdhury RI, Briollais L. A bivariate binary model for testing dependence in outcomes. Bull Malaysian Math Sci Soc. 2012; 35: 845-858.
- Rukhin AL. Asymptotic behavior of estimators of the change-point in a binomial probability. J Appl Stat Sci. 1995; 1: 1-12.
- Monahan J, Genz A. Spherical-radial integration rules for Bayesian computation. J Am Stat Assoc. 1997; 92: 664-674.
- Le Tourneau C, Lee JJ, Siu LL. Dose escalation methods in phase I cancer clinical trials. J Natl Cancer Inst. 2009; 101: 708-720.