I use the gbm library in R and I would like to use all my CPU to fit a model.
gbm.fit(x, y,
offset = NULL,
misc = NULL,...
Well, there cannot be a parallel implementation of GBM in principle, neither in R nor in any other implementation. And the reason is very simple: the boosting algorithm is by definition sequential.
Consider the following, quoted from The Elements of Statistical Learning, Ch. 10 (Boosting and Additive Trees), pp. 337-339 (emphasis mine):
A weak classifier is one whose error rate is only slightly better than random guessing. The purpose of boosting is to sequentially apply the weak classification algorithm to repeatedly modified versions of the data, thereby producing a sequence of weak classifiers Gm(x), m = 1, 2, . . . , M. The predictions from all of them are then combined through a weighted majority vote to produce the final prediction. [...] Each successive classifier is thereby forced to concentrate on those training observations that are missed by previous ones in the sequence.
In a picture (ibid., p. 338):

[Figure: schematic of AdaBoost -- weighted versions of the training data feed a sequence of weak classifiers G1(x), ..., GM(x), which are then combined into the final weighted-majority prediction]
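To make the sequential dependence concrete, here is a minimal sketch (not the `gbm` implementation itself, just an illustration) of least-squares gradient boosting with decision stumps. Note that iteration m fits the residuals left over by iterations 1, ..., m-1, so stump m cannot even be defined, let alone fitted, before stump m-1 is finished:

```r
set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.1)

nu <- 0.1                        # shrinkage (learning rate)
M  <- 100                        # number of boosting iterations
f  <- rep(mean(y), length(y))    # initial fit: the constant model

# A "stump": the single split on x that best fits the current residuals.
fit_stump <- function(x, r) {
  splits <- quantile(x, probs = seq(0.05, 0.95, by = 0.05))
  best <- NULL; best_sse <- Inf
  for (s in splits) {
    left <- x <= s
    pred <- ifelse(left, mean(r[left]), mean(r[!left]))
    sse  <- sum((r - pred)^2)
    if (sse < best_sse) {
      best_sse <- sse
      best <- list(s = s, l = mean(r[left]), r = mean(r[!left]))
    }
  }
  best
}

for (m in seq_len(M)) {
  r  <- y - f                    # residuals depend on ALL previous stumps...
  st <- fit_stump(x, r)          # ...so stump m must wait for stump m-1
  f  <- f + nu * ifelse(x <= st$s, st$l, st$r)
}

mean((y - f)^2)                  # training MSE shrinks as M grows
```

There is simply no place in this loop where two iterations could run at the same time, which is the whole point of the answer above.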
In fact, this is frequently noted as a key disadvantage of GBM relative to, say, Random Forest (RF), where the trees are independent and can thus be fitted in parallel (see the bigrf R package).
Hence, the best you can do, as the commenters above have pointed out, is to use your spare CPU cores to parallelize the cross-validation; the `gbm()` function even exposes an `n.cores` argument for exactly this purpose, distributing the CV folds across cores.
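The reason CV parallelizes while boosting does not is that the folds are mutually independent. Here is a sketch using the base `parallel` package; the `lm()` call is a stand-in for whatever `gbm::gbm(...)` call you would actually make inside each fold (assumed, not taken from the question):

```r
library(parallel)

set.seed(42)
d   <- data.frame(x = runif(100))
d$y <- 2 * d$x + rnorm(100, sd = 0.1)

k    <- 5
fold <- sample(rep(seq_len(k), length.out = nrow(d)))

# Each fold is fitted on data the other folds never touch, so the k fits
# are independent and can run concurrently. (mclapply forks, so on
# Windows use parLapply with a PSOCK cluster instead.)
cv_errs <- mclapply(seq_len(k), function(i) {
  train <- d[fold != i, ]
  test  <- d[fold == i, ]
  fit <- lm(y ~ x, data = train)           # stand-in for gbm::gbm(...)
  mean((test$y - predict(fit, test))^2)    # held-out MSE for this fold
}, mc.cores = 2)

mean(unlist(cv_errs))                      # the cross-validated error
```

Within each fold the boosting iterations would still run one after another; the parallelism is strictly across folds.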