Commit f13d852

Merge branch 'main' into fix-implied-integers
2 parents 21e80ac + 88d0fa7 commit f13d852

10 files changed: +1291 -0 lines changed

dependencies.yaml

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@ files:
   test_python:
     output: none
     includes:
+      - cuda_version
       - py_version
       - depends_on_libcuopt
       - depends_on_cuopt

docs/cuopt/source/cuopt-server/index.rst

Lines changed: 7 additions & 0 deletions
@@ -44,3 +44,10 @@ Please refer to following links for more information on API and examples:
    :titlesonly:

    CSP-Guides<csp-guides/index.rst>
+
+.. toctree::
+   :caption: NIM Operator
+   :name: NIM Operator
+   :titlesonly:
+
+   NIM-Operator<nim-operator/index.rst>
Lines changed: 287 additions & 0 deletions
@@ -0,0 +1,287 @@
.. _cuopt-nim-configuration:

Configuration Guide
===================

This guide covers configuration options for the cuOpt NIM Operator deployment.

Image Configuration
-------------------

cuOpt Image Versions
^^^^^^^^^^^^^^^^^^^^

Update the image tag in ``cuopt-nimservice.yaml``:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - CUDA Version
     - Image Tag
   * - CUDA 12.9
     - ``25.12.0-cuda12.9-py3.13``

.. code-block:: yaml

   spec:
     image:
       repository: nvcr.io/nvidia/cuopt/cuopt
       tag: "25.12.0-cuda12.9-py3.13"
       pullPolicy: IfNotPresent
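
The image tag packs three versions into one string. As a rough illustration, the Python sketch below splits such a tag into its parts so that a deployment script could compare the CUDA version against what the cluster provides. The ``<release>-cuda<X.Y>-py<X.Y>`` layout is inferred from the single tag listed above, and ``parse_cuopt_tag`` is a hypothetical helper, not part of cuOpt.

.. code-block:: python

   import re

   # Assumed layout, inferred from the one tag in the table above:
   #   <release>-cuda<major.minor>-py<major.minor>
   TAG_PATTERN = re.compile(
       r"^(?P<release>\d+\.\d+\.\d+)-cuda(?P<cuda>\d+\.\d+)-py(?P<python>\d+\.\d+)$"
   )


   def parse_cuopt_tag(tag: str) -> dict:
       """Split an image tag into release, CUDA, and Python versions."""
       match = TAG_PATTERN.match(tag)
       if match is None:
           raise ValueError(f"unrecognized image tag: {tag!r}")
       return match.groupdict()


   parts = parse_cuopt_tag("25.12.0-cuda12.9-py3.13")
   print(parts)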

Resource Configuration
----------------------

GPU Resources
^^^^^^^^^^^^^

Configure GPU allocation:

.. code-block:: yaml

   spec:
     resources:
       limits:
         nvidia.com/gpu: 1  # Number of GPUs

Memory Resources
^^^^^^^^^^^^^^^^

For workloads requiring specific memory allocation:

.. code-block:: yaml

   spec:
     resources:
       limits:
         nvidia.com/gpu: 1
         memory: "32Gi"
       requests:
         memory: "16Gi"
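
Kubernetes rejects a container whose memory request exceeds its limit, so it can help to check the two values before applying the manifest. The following is a minimal sketch of that check, assuming only binary suffixes (``Ki``, ``Mi``, ``Gi``, ``Ti``) or plain byte counts are used. Both helper names are hypothetical.

.. code-block:: python

   # Binary suffixes only; the full Kubernetes quantity grammar also allows
   # decimal suffixes (k, M, G, ...) and fractional values, which this sketch skips.
   _BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


   def quantity_to_bytes(quantity: str) -> int:
       """Convert a quantity such as "32Gi" to a byte count."""
       for suffix, factor in _BINARY_SUFFIXES.items():
           if quantity.endswith(suffix):
               return int(quantity[: -len(suffix)]) * factor
       return int(quantity)  # plain byte count


   def request_fits_limit(request: str, limit: str) -> bool:
       """Return True when the memory request does not exceed the limit."""
       return quantity_to_bytes(request) <= quantity_to_bytes(limit)


   print(quantity_to_bytes("32Gi"))
   print(request_fits_limit("16Gi", "32Gi"))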

Environment Variables
---------------------

cuOpt supports several environment variables for configuration:

.. code-block:: yaml

   spec:
     env:
       - name: CUOPT_DATA_DIR
         value: /model-store
       - name: CUOPT_SERVER_LOG_LEVEL
         value: info  # Options: debug, info, warning, error
       - name: CUOPT_SERVER_PORT
         value: "8000"
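
Kubernetes requires every ``value`` under ``env`` to be a string, which is why the port is quoted above. The sketch below shows one way to catch an unquoted number or a misspelled log level before the manifest is applied. The allowed log levels are taken from the comment in the example; ``validate_env`` is a hypothetical helper, and the input is assumed to be the ``env`` list after YAML parsing.

.. code-block:: python

   ALLOWED_LOG_LEVELS = {"debug", "info", "warning", "error"}  # from the comment above


   def validate_env(env: list) -> list:
       """Return a list of problems found in a NIMService ``env`` list."""
       problems = []
       for entry in env:
           name, value = entry.get("name"), entry.get("value")
           if not isinstance(value, str):
               # An unquoted 8000 in YAML parses as an int, which Kubernetes rejects.
               problems.append(
                   f"{name}: value must be a string, got {type(value).__name__}"
               )
               continue
           if name == "CUOPT_SERVER_LOG_LEVEL" and value not in ALLOWED_LOG_LEVELS:
               problems.append(f"{name}: unknown log level {value!r}")
           if name == "CUOPT_SERVER_PORT" and not value.isdigit():
               problems.append(f"{name}: port must be numeric, got {value!r}")
       return problems


   env = [
       {"name": "CUOPT_DATA_DIR", "value": "/model-store"},
       {"name": "CUOPT_SERVER_LOG_LEVEL", "value": "info"},
       {"name": "CUOPT_SERVER_PORT", "value": "8000"},
   ]
   print(validate_env(env))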

Storage Configuration
---------------------

The deployment optionally uses persistent storage so that datasets can be passed through the filesystem
rather than over HTTP. If data is sent over HTTP (the default), this storage is not needed.

.. code-block:: yaml

   spec:
     storage:
       pvc:
         create: true
         size: 10Gi
         storageClass: ""  # Uses default storage class
         volumeAccessMode: "ReadWriteOnce"

For custom storage class:

.. code-block:: yaml

   spec:
     storage:
       pvc:
         create: true
         size: 20Gi
         storageClass: "fast-ssd"
         volumeAccessMode: "ReadWriteOnce"

Networking Configuration
------------------------

Service Configuration
^^^^^^^^^^^^^^^^^^^^^

Default ClusterIP service:

.. code-block:: yaml

   spec:
     expose:
       service:
         type: ClusterIP
         port: 8000

For NodePort access:

.. code-block:: yaml

   spec:
     expose:
       service:
         type: NodePort
         port: 8000
         nodePort: 30800
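
The ``nodePort`` value has to fall inside the range the cluster allows. By default that range is 30000-32767, but administrators can change it with the ``--service-node-port-range`` flag of ``kube-apiserver``, so the bounds in this sketch are an assumption about the target cluster. ``is_valid_node_port`` is a hypothetical helper.

.. code-block:: python

   # Default kube-apiserver range; clusters can change it with
   # --service-node-port-range, so treat these bounds as an assumption.
   DEFAULT_NODE_PORT_RANGE = range(30000, 32768)


   def is_valid_node_port(port: int, allowed: range = DEFAULT_NODE_PORT_RANGE) -> bool:
       """Return True if ``port`` falls inside the allowed NodePort range."""
       return port in allowed


   print(is_valid_node_port(30800))  # the nodePort used above
   print(is_valid_node_port(8000))   # the service port is not a valid nodePort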

For LoadBalancer (cloud environments):

.. note:: Currently the cuOpt service does not support scaling; there can only be one instance of the pod per service. Therefore a LoadBalancer service is unnecessary.

.. code-block:: yaml

   spec:
     expose:
       service:
         type: LoadBalancer
         port: 8000

Ingress Configuration
^^^^^^^^^^^^^^^^^^^^^

To expose cuOpt externally via ingress:

.. code-block:: yaml

   spec:
     expose:
       service:
         type: ClusterIP
         port: 8000
       ingress:
         enabled: true
         spec:
           ingressClassName: nginx
           rules:
             - host: cuopt.example.com
               http:
                 paths:
                   - backend:
                       service:
                         name: cuopt-service
                         port:
                           number: 8000
                     path: /
                     pathType: Prefix

With TLS:

.. code-block:: yaml

   spec:
     expose:
       ingress:
         enabled: true
         spec:
           ingressClassName: nginx
           tls:
             - hosts:
                 - cuopt.example.com
               secretName: cuopt-tls-secret
           rules:
             - host: cuopt.example.com
               http:
                 paths:
                   - backend:
                       service:
                         name: cuopt-service
                         port:
                           number: 8000
                     path: /
                     pathType: Prefix
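
In the TLS example the same host appears under both ``tls`` and ``rules``. A host that is routed but not listed under ``tls`` would not be served with the certificate from ``secretName``, so a quick consistency check can be useful. The sketch below is one hypothetical way to do it on the parsed ingress ``spec``; it compares hosts by exact match only and does not handle wildcard entries.

.. code-block:: python

   def hosts_missing_tls(ingress_spec: dict) -> list:
       """Return rule hosts that no ``tls`` entry lists (exact match only)."""
       covered = {
           host
           for entry in ingress_spec.get("tls", [])
           for host in entry.get("hosts", [])
       }
       return [
           rule["host"]
           for rule in ingress_spec.get("rules", [])
           if rule.get("host") and rule["host"] not in covered
       ]


   ingress_spec = {
       "ingressClassName": "nginx",
       "tls": [{"hosts": ["cuopt.example.com"], "secretName": "cuopt-tls-secret"}],
       "rules": [{"host": "cuopt.example.com"}],
   }
   print(hosts_missing_tls(ingress_spec))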

Scaling Configuration
---------------------

Currently the cuOpt service does not support scaling. Only a single instance of the pod per service is supported.

Health Probes
-------------

Liveness Probe
^^^^^^^^^^^^^^

Determines if the container is running:

.. code-block:: yaml

   spec:
     livenessProbe:
       enabled: true
       probe:
         failureThreshold: 3
         httpGet:
           path: /v2/health/live
           port: api
         initialDelaySeconds: 15
         periodSeconds: 10
         successThreshold: 1
         timeoutSeconds: 1

Readiness Probe
^^^^^^^^^^^^^^^

Determines if the container is ready to accept traffic:

.. code-block:: yaml

   spec:
     readinessProbe:
       enabled: true
       probe:
         failureThreshold: 30
         httpGet:
           path: /v2/health/ready
           port: api
         initialDelaySeconds: 30
         periodSeconds: 10
         successThreshold: 1
         timeoutSeconds: 1
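
The same endpoint can be polled from a client to wait for the service before sending requests. The sketch below mimics the readiness probe's loop (30 attempts, 10 seconds apart, 1 second timeout) using only the Python standard library. Like the kubelet, it treats any status from 200 to 399 as success. The function name and the injectable ``fetch`` and ``sleep`` parameters are illustrative choices, and the URL in the usage note is an assumption about how the service is reached.

.. code-block:: python

   import time
   import urllib.error
   import urllib.request


   def wait_until_ready(url, attempts=30, period=10.0, timeout=1.0,
                        fetch=urllib.request.urlopen, sleep=time.sleep):
       """Poll ``url`` until it answers with a 2xx/3xx status.

       ``fetch`` and ``sleep`` are injectable so the loop can be tested offline.
       """
       for attempt in range(attempts):
           try:
               with fetch(url, timeout=timeout) as response:
                   if 200 <= response.status < 400:
                       return True
           except (urllib.error.URLError, OSError):
               pass  # not reachable or not ready yet; retry after the period
           if attempt < attempts - 1:
               sleep(period)
       return False

For example, with the service port forwarded to the local machine (an assumed setup), ``wait_until_ready("http://localhost:8000/v2/health/ready")`` returns ``True`` once the endpoint responds, or ``False`` after the last attempt.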

Startup Probe
^^^^^^^^^^^^^

For slow-starting containers:

.. code-block:: yaml

   spec:
     startupProbe:
       enabled: true
       probe:
         failureThreshold: 30
         httpGet:
           path: /v2/health/ready
           port: api
         periodSeconds: 10
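
While a startup probe is configured, Kubernetes holds off liveness and readiness checks until it succeeds, so ``failureThreshold`` times ``periodSeconds`` is effectively the startup time budget. The short sketch below works through that arithmetic for the values in this guide. It is an approximation that ignores ``timeoutSeconds`` and kubelet scheduling jitter, and ``failure_window_seconds`` is a hypothetical helper.

.. code-block:: python

   def failure_window_seconds(failure_threshold: int, period_seconds: int,
                              initial_delay_seconds: int = 0) -> int:
       """Approximate time before a probe is declared failed.

       Ignores per-attempt ``timeoutSeconds`` and kubelet scheduling jitter.
       """
       return initial_delay_seconds + failure_threshold * period_seconds


   # Startup probe above: 30 failures allowed, one check every 10 seconds.
   startup_budget = failure_window_seconds(30, 10)
   # Liveness probe, once running: 3 consecutive failures, 10 seconds apart.
   liveness_window = failure_window_seconds(3, 10)
   print(startup_budget, liveness_window)

With these settings the container has roughly 300 seconds to start, and once it is running, about 30 seconds of consecutive liveness failures trigger a restart.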

Monitoring Configuration
------------------------

Enable Prometheus metrics and ServiceMonitor:

.. code-block:: yaml

   spec:
     metrics:
       enabled: true
       serviceMonitor:
         additionalLabels:
           release: kube-prometheus-stack

Full Configuration Example
--------------------------

Here's a complete production-ready configuration:

:download:`cuopt-nimservice-full.yaml <guide/cuopt-nimservice-full.yaml>`

.. literalinclude:: guide/cuopt-nimservice-full.yaml
   :language: yaml
   :linenos:
