NumPy Interview Questions: Arrays, Broadcasting, Views, Copies, Random, and Practical Coding

NumPy interviews usually test more than function names.

They check whether you understand how arrays are stored, why vectorization is faster than Python loops, when slicing creates a view, when indexing creates a copy, how broadcasting works, and how to solve small data problems without writing unnecessary loops.

This guide is written as a practical interview-preparation file. Each answer is original and uses simple examples from analytics, student scores, product data, images, and machine learning workflows.

You will prepare for questions about:

ndarray, shape, dtype, ndim, size, itemsize, and strides
arrays vs Python lists
vectorization and ufuncs
views, copies, shallow-looking array behavior, and .base
basic indexing, boolean indexing, and fancy indexing
broadcasting rules and np.newaxis
axis-based aggregation
sorting, ranking, filtering, clipping, and set operations
random number generation with default_rng
allclose and floating-point comparison
meshgrid, swapaxes, tile, repeat, and count_nonzero
image-like arrays
structured arrays
saving and loading NumPy data
code-output and debugging interview tasks

How To Use This Guide

Read each question and try to answer it before reading the explanation.

For code-output questions, write the output on paper first. In interviews, the goal is not only to know NumPy functions. The goal is to reason from:

text

shape + dtype + axis + indexing rule

If you can explain those four things clearly, most NumPy interview questions become manageable.

1. What Is NumPy?

NumPy is a Python library for numerical computing.

Its main object is the ndarray, which stores values in a compact, typed, multidimensional array.

Interview answer:

NumPy is used for fast numerical operations on arrays. It provides vectorized operations, broadcasting, efficient memory storage, mathematical functions, random number generation, and linear algebra tools. Many libraries such as Pandas, scikit-learn, TensorFlow, PyTorch, and image-processing tools use NumPy-style arrays internally or at their boundaries.

Example:

python

import numpy as np

prices = np.array([100, 150, 200])
discounted = prices * 0.9

print(discounted)

Output:

text

[ 90. 135. 180.]

2. What Is An `ndarray`?

An ndarray is NumPy's n-dimensional array object.

It is usually:

multidimensional
homogeneous, meaning values normally share one dtype
fixed-size after creation
optimized for numerical operations

Example:

python

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(type(arr))
print(arr.shape)
print(arr.dtype)

Output:

text

<class 'numpy.ndarray'>
(2, 3)
int64

The default integer dtype is platform-dependent. For example, it is int32 on Windows with NumPy versions before 2.0.

3. How Is A NumPy Array Different From A Python List?

A Python list stores references to Python objects. It can hold mixed types and can grow dynamically.

A NumPy array stores fixed-size values of a common dtype in a compact memory layout.

Interview answer:

Lists are general-purpose containers. NumPy arrays are specialized numerical containers. Arrays support vectorized operations, use memory more compactly for numerical data, and work naturally with multidimensional shapes.

Example:

python

print([1, 2, 3] * 2)
print(np.array([1, 2, 3]) * 2)

Output:

text

[1, 2, 3, 1, 2, 3]
[2 4 6]
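
The memory claim can be checked roughly with a small sketch. It assumes CPython, where `sys.getsizeof` reports per-object sizes; the exact list figure varies by Python version, so treat it as an estimate. The variable names are illustrative only.

```python
import sys
import numpy as np

n = 1_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds pointers, and each int is a separate Python object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array holds raw 8-byte values in one contiguous buffer.
array_bytes = np_arr.nbytes

print(list_bytes, array_bytes)
print(array_bytes == n * np_arr.itemsize)
```

The array figure is exactly `size * itemsize`, while the list figure also pays for object headers and pointers.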

4. What Do `shape`, `ndim`, `size`, `dtype`, And `itemsize` Mean?

Use this array:

python

data = np.array([
    [10, 20, 30],
    [40, 50, 60],
], dtype=np.int32)

Attributes:

python

print(data.shape)
print(data.ndim)
print(data.size)
print(data.dtype)
print(data.itemsize)

Output:

text

(2, 3)
2
6
int32
4

Meaning:

shape: size along each dimension
ndim: number of dimensions
size: total number of elements
dtype: data type of each element
itemsize: bytes used by one element

5. What Are Strides?

Strides tell NumPy how many bytes to move in memory to reach the next element along each axis.

Example:

python

arr = np.arange(12, dtype=np.int32).reshape(3, 4)

print(arr)
print(arr.strides)

Possible output:

text

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(16, 4)

Why?

Each int32 value uses 4 bytes.
Moving one column moves 4 bytes.
Moving one row moves 4 columns x 4 bytes = 16 bytes.

Interview answer:

Strides describe how an n-dimensional index maps to the underlying memory buffer. They are one reason NumPy can create views, transpose arrays, and slice arrays without always copying data.
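
As a small illustration of that answer, the sketch below shows that a transpose and a stepped slice only change the strides while still pointing at the same buffer. It reuses the array from the example above; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(12, dtype=np.int32).reshape(3, 4)
transposed = arr.T            # swaps the two strides, no data moved
every_other_col = arr[:, ::2]  # doubles the column stride

print(arr.strides)               # (16, 4)
print(transposed.strides)        # (4, 16)
print(every_other_col.strides)   # (16, 8)
print(np.shares_memory(arr, transposed))  # True
```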

6. What Is A View?

A view is a new array object that looks at the same underlying data as another array.

Example:

python

arr = np.array([10, 20, 30, 40])
view = arr[1:3]

view[0] = 999

print(arr)
print(view)

Output:

text

[ 10 999  30  40]
[999  30]

Changing the view changed the original.

Interview answer:

A view shares the original array's data buffer. It can be faster and memory-efficient, but changes through one array may appear in the other.

7. What Is A Copy?

A copy owns separate data.

Example:

python

arr = np.array([10, 20, 30, 40])
copy_arr = arr[1:3].copy()

copy_arr[0] = 999

print(arr)
print(copy_arr)

Output:

text

[10 20 30 40]
[999  30]

The original did not change.

8. How Can You Check Whether An Array Is A View?

Use .base as a learning/debugging clue.

python

arr = np.arange(6)
view = arr[1:4]
copy_arr = arr[[1, 2, 3]]

print(view.base is arr)
print(copy_arr.base is None)

Output:

text

True
True

Important interview point:

basic slicing usually creates views
advanced indexing usually creates copies
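
Another way to check, sketched here as a complement to `.base`, is `np.shares_memory`, which asks directly whether two arrays overlap in memory.

```python
import numpy as np

arr = np.arange(6)
view = arr[1:4]             # basic slice
copy_arr = arr[[1, 2, 3]]   # advanced (fancy) indexing

print(np.shares_memory(arr, view))       # True
print(np.shares_memory(arr, copy_arr))   # False
```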

9. What Is The Difference Between Basic Indexing And Advanced Indexing?

Basic indexing uses integers, slices, ellipsis, and None or np.newaxis.

Advanced indexing uses integer arrays or boolean arrays.

Example:

python

arr = np.arange(9).reshape(3, 3)

basic = arr[1:, :]
advanced = arr[[1, 2], :]

print(basic)
print(advanced)

Both may look similar, but their memory behavior differs.

Interview answer:

Basic slicing generally returns a view. Advanced indexing generally returns a copy. This matters for memory usage and whether changes affect the original array.
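
One nuance worth showing: the copy behavior applies when you read with an advanced index. Assigning through an advanced index still writes into the original array. A minimal sketch, with illustrative names:

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# Reading with an integer list gives a copy, so this does not touch arr.
picked = arr[[0, 2], :]
picked[:] = -1
before_write = arr.copy()
print(before_write[0])   # [0 1 2]

# Assigning through the same index writes into arr itself.
arr[[0, 2], :] = -1
print(arr)
```

After the assignment, rows 0 and 2 of `arr` hold -1 and row 1 is unchanged.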

10. What Is Boolean Indexing?

Boolean indexing selects values where a condition is true.

python

scores = np.array([45, 72, 88, 39, 91])

passed = scores[scores >= 50]

print(passed)

Output:

text

[72 88 91]

The condition creates a boolean mask:

python

print(scores >= 50)

Output:

text

[False  True  True False  True]
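
Masks can also be combined. The sketch below uses `&`, `|`, and `~` on the same scores; each condition needs its own parentheses because these operators bind more tightly than comparisons. The result names are illustrative.

```python
import numpy as np

scores = np.array([45, 72, 88, 39, 91])

mid_band = scores[(scores >= 50) & (scores < 90)]   # both conditions
extremes = scores[(scores < 40) | (scores > 90)]    # either condition
not_passed = scores[~(scores >= 50)]                # negated mask

print(mid_band)     # [72 88]
print(extremes)     # [39 91]
print(not_passed)   # [45 39]
```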

11. What Is Fancy Indexing?

Fancy indexing selects values using arrays or lists of indexes.

python

scores = np.array([45, 72, 88, 39, 91])

selected = scores[[0, 2, 4]]

print(selected)

Output:

text

[45 88 91]

Fancy indexing is useful for selecting specific rows, columns, or records.

12. What Is Broadcasting?

Broadcasting is NumPy's rule for operating on arrays with different shapes.

Example:

python

matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
])

bonus = np.array([1, 2, 3])

print(matrix + bonus)

Output:

text

[[11 22 33]
 [41 52 63]]

The 1D bonus array is applied across each row.

Interview answer:

Broadcasting lets NumPy perform element-wise operations on compatible shapes without physically copying the smaller array across the larger one.

13. What Are The Broadcasting Rules?

Compare shapes from right to left.

Two dimensions are compatible if:

they are equal, or
one of them is 1

Example:

text

(4, 3)
(   3)

Compatible because the last dimension is 3.

Example:

text

(4, 3)
(4,)

Not compatible because the trailing dimensions are 3 and 4.

Code:

python

a = np.zeros((4, 3))
b = np.array([1, 2, 3])

print((a + b).shape)

Output:

text

(4, 3)
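
To check the rules without building arrays, one option is `np.broadcast_shapes`, sketched below. This assumes NumPy 1.20 or newer, where that function is available.

```python
import numpy as np

print(np.broadcast_shapes((4, 3), (3,)))     # (4, 3)
print(np.broadcast_shapes((4, 1), (1, 3)))   # (4, 3)

# Incompatible shapes raise ValueError instead of returning a shape.
try:
    np.broadcast_shapes((4, 3), (4,))
    compatible = True
except ValueError:
    compatible = False

print(compatible)   # False
```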

14. How Does `np.newaxis` Help Broadcasting?

np.newaxis adds a dimension of size 1.

python

row = np.array([1, 2, 3])
column = row[:, np.newaxis]

print(row.shape)
print(column.shape)

Output:

text

(3,)
(3, 1)

Create an outer addition table:

python

a = np.array([10, 20, 30])
b = np.array([1, 2, 3, 4])

result = a[:, np.newaxis] + b

print(result)

Output:

text

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]

15. What Is Vectorization?

Vectorization means applying operations to entire arrays instead of writing Python loops over individual elements.

Loop version:

python

values = [10, 20, 30]
result = []

for value in values:
    result.append(value * 2)

NumPy version:

python

values = np.array([10, 20, 30])
result = values * 2

Interview answer:

Vectorization is faster because NumPy performs the loop in optimized compiled code and avoids much of the overhead of Python-level iteration.
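
A rough way to see the difference is to time both versions. This is only a sketch: the function names are illustrative, and the measured ratio depends on the machine, the array size, and the NumPy build, so no specific speedup is claimed.

```python
import timeit
import numpy as np

values_list = list(range(100_000))
values_arr = np.arange(100_000)

def double_with_loop():
    # Python-level iteration over every element
    return [v * 2 for v in values_list]

def double_vectorized():
    # One call; the loop runs in compiled code
    return values_arr * 2

loop_time = timeit.timeit(double_with_loop, number=20)
vec_time = timeit.timeit(double_vectorized, number=20)

print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
print(np.array_equal(double_with_loop(), double_vectorized()))
```

Both versions produce the same values; only the time taken differs.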

16. What Are Ufuncs?

Ufunc means universal function.

Ufuncs apply element-wise operations efficiently.

Examples:

python

arr = np.array([1, 4, 9, 16])

print(np.sqrt(arr))
print(np.add(arr, 10))

Output:

text

[1. 2. 3. 4.]
[11 14 19 26]

Common ufuncs include:

np.add
np.subtract
np.multiply
np.divide
np.sqrt
np.exp
np.log
np.sin

17. What Is The Difference Between `axis=0` And `axis=1`?

For a 2D array:

axis=0 works down rows and returns one value per column
axis=1 works across columns and returns one value per row

Example:

python

marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
])

print(marks.sum(axis=0))
print(marks.sum(axis=1))

Output:

text

[130 155 175]
[240 220]

Interview shortcut:

The axis you pass is the axis that gets reduced.

18. Why Use `keepdims=True`?

keepdims=True keeps reduced axes as dimensions of size 1.

This is useful for broadcasting.

python

marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
])

row_mean = marks.mean(axis=1, keepdims=True)
centered = marks - row_mean

print(row_mean.shape)
print(centered)

Output:

text

(2, 1)
[[-10.           0.          10.        ]
 [-13.33333333   1.66666667  11.66666667]]

Without keepdims=True, broadcasting may fail or mean something different.
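
Both failure modes can be sketched briefly. With a non-square array the subtraction raises an error; with a square array it runs but subtracts along the wrong axis. The `square` array is a made-up example.

```python
import numpy as np

marks = np.array([[70, 80, 90], [60, 75, 85]])

# Case 1: shapes (2, 3) and (2,) are incompatible, so this raises.
try:
    marks - marks.mean(axis=1)
    failed = False
except ValueError:
    failed = True
print(failed)   # True

# Case 2: a square array broadcasts silently, but along the wrong axis.
square = np.array([[0.0, 2.0], [10.0, 30.0]])
wrong = square - square.mean(axis=1)                  # row means applied per column
right = square - square.mean(axis=1, keepdims=True)   # row means applied per row

print(wrong)
print(right)
```

Here `wrong` and `right` have the same shape but different values, which is why the silent case is the more dangerous one.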

19. What Is `reshape()`?

reshape() changes the shape of an array without changing the number of elements.

python

arr = np.arange(12)
matrix = arr.reshape(3, 4)

print(matrix)

Output:

text

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

This fails:

python

np.arange(12).reshape(5, 3)

because 12 values cannot fill 15 positions.

20. Does `reshape()` Return A View Or A Copy?

Often, reshape() can return a view, but it depends on memory layout.

Interview answer:

reshape() returns a view when possible. If the requested shape cannot be represented with compatible strides, NumPy returns a copy instead. Assigning a new shape directly to arr.shape is the in-place alternative, and it raises an error rather than copying.

Practical advice:

python

arr = np.arange(12)
reshaped = arr.reshape(3, 4)

print(reshaped.base is arr)

Use .base only as a learning/debugging tool, not as business logic.
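
To illustrate the copy case, here is a minimal sketch in which a transposed array cannot be flattened without rearranging data. It uses `np.shares_memory` for the check; the names are illustrative.

```python
import numpy as np

base = np.arange(6)
grid = base.reshape(2, 3)          # contiguous data: a view is possible
flat_from_t = grid.T.reshape(6)    # transposed layout: NumPy has to copy

print(np.shares_memory(base, grid))          # True
print(np.shares_memory(base, flat_from_t))   # False
print(flat_from_t)                           # [0 3 1 4 2 5]
```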

21. What Is The Difference Between `ravel()` And `flatten()`?

Both convert an array to 1D.

Important difference:

ravel() returns a view when possible
flatten() always returns a copy

Example:

python

matrix = np.arange(6).reshape(2, 3)

flat_view = matrix.ravel()
flat_copy = matrix.flatten()

flat_view[0] = 999
flat_copy[1] = 888

print(matrix)

Output:

text

[[999   1   2]
 [  3   4   5]]

flat_copy did not affect the original.

22. What Is The Difference Between `transpose`, `.T`, And `swapaxes`?

For 2D arrays, .T and transpose() both swap rows and columns.

python

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

print(matrix.T)

Output:

text

[[1 4]
 [2 5]
 [3 6]]

For higher dimensions, swapaxes() swaps two chosen axes.

python

arr = np.zeros((2, 3, 4))

print(np.swapaxes(arr, 0, 2).shape)

Output:

text

(4, 3, 2)

Interview answer:

.T reverses axes. transpose() can reorder axes explicitly. swapaxes() swaps exactly two axes.
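
The three options can be compared on one 3D array. This sketch only inspects shapes; the variable names are illustrative.

```python
import numpy as np

arr = np.zeros((2, 3, 4))

reversed_axes = arr.T                       # axes order (2, 1, 0)
reordered = np.transpose(arr, (1, 0, 2))    # explicit axis order
swapped = np.swapaxes(arr, 0, 1)            # swap exactly two axes

print(reversed_axes.shape)   # (4, 3, 2)
print(reordered.shape)       # (3, 2, 4)
print(swapped.shape)         # (3, 2, 4)
```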

23. What Is `np.expand_dims()`?

np.expand_dims() inserts a new axis.

python

arr = np.array([10, 20, 30])

row = np.expand_dims(arr, axis=0)
column = np.expand_dims(arr, axis=1)

print(row.shape)
print(column.shape)

Output:

text

(1, 3)
(3, 1)

It is commonly used when a model expects a batch dimension.

24. What Is `np.squeeze()`?

np.squeeze() removes axes of length 1.

python

arr = np.zeros((1, 3, 1, 4))

print(np.squeeze(arr).shape)

Output:

text

(3, 4)

Use it carefully. Removing a batch dimension accidentally can break model input shapes.

25. What Is The Difference Between `np.concatenate`, `vstack`, `hstack`, And `stack`?

concatenate joins arrays along an existing axis.

python

a = np.array([[1, 2]])
b = np.array([[3, 4]])

print(np.concatenate((a, b), axis=0))

Output:

text

[[1 2]
 [3 4]]

vstack stacks vertically.

hstack stacks horizontally.

stack creates a new axis.

python

x = np.array([1, 2])
y = np.array([3, 4])

print(np.stack((x, y), axis=0))
print(np.stack((x, y), axis=1))

Output:

text

[[1 2]
 [3 4]]
[[1 3]
 [2 4]]

Interview answer:

Use concatenate when joining along an existing dimension. Use stack when creating a new dimension.
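
Since vstack and hstack are described above without code, here is a short sketch of both on 1D and 2D inputs. The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2])
y = np.array([3, 4])

stacked_rows = np.vstack((x, y))   # 1D inputs become rows
joined_flat = np.hstack((x, y))    # 1D inputs are joined end to end

a = np.array([[1, 2]])
b = np.array([[3, 4]])
side_by_side = np.hstack((a, b))   # 2D inputs are joined along columns

print(stacked_rows.shape)    # (2, 2)
print(joined_flat)           # [1 2 3 4]
print(side_by_side.shape)    # (1, 4)
```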

26. What Is The Difference Between `np.tile()` And `np.repeat()`?

tile() repeats the whole array pattern.

python

arr = np.array([1, 2, 3])

print(np.tile(arr, 2))

Output:

text

[1 2 3 1 2 3]

repeat() repeats individual elements.

python

print(np.repeat(arr, 2))

Output:

text

[1 1 2 2 3 3]

For 2D arrays, axis controls the direction for repeat.

python

matrix = np.array([[1, 2], [3, 4]])

print(np.repeat(matrix, 2, axis=0))

Output:

text

[[1 2]
 [1 2]
 [3 4]
 [3 4]]

27. What Is `np.where()`?

np.where() has two common uses.

Find positions:

python

scores = np.array([45, 80, 62, 30])

print(np.where(scores >= 60))

Output:

text

(array([1, 2]),)

Choose values conditionally:

python

labels = np.where(scores >= 60, "pass", "retry")

print(labels)

Output:

text

['retry' 'pass' 'pass' 'retry']

28. What Is `np.clip()`?

np.clip() limits values to a minimum and maximum.

python

values = np.array([-5, 10, 50, 120])

print(np.clip(values, 0, 100))

Output:

text

[  0  10  50 100]

Use it for outlier control, image pixel limits, probability bounds, and safe feature ranges.

29. What Is `np.count_nonzero()`?

It counts non-zero values.

python

arr = np.array([
    [1, 0, 3],
    [0, 0, 6],
])

print(np.count_nonzero(arr))
print(np.count_nonzero(arr, axis=0))
print(np.count_nonzero(arr, axis=1))

Output:

text

3
[1 0 2]
[2 1]

It is often used to count true values because True behaves like 1 and False like 0.

python

scores = np.array([45, 80, 62, 30])

print(np.count_nonzero(scores >= 60))

Output:

text

2

30. What Is `np.allclose()` And Why Is It Important?

Floating-point values can have tiny precision differences.

Do not compare floats using exact equality when small numerical error is expected.

python

a = np.array([0.1 + 0.2])
b = np.array([0.3])

print(a == b)
print(np.allclose(a, b))

Output:

text

[False]
True

Interview answer:

np.allclose() checks whether arrays are element-wise equal within a tolerance. It is useful for testing numerical code where tiny floating-point differences are acceptable.
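
The tolerance can be adjusted. NumPy documents the test as `|a - b| <= atol + rtol * |b|`, with defaults `rtol=1e-05` and `atol=1e-08`. The sketch below uses made-up measurement values to show the effect.

```python
import numpy as np

measured = np.array([1.0, 2.0, 3.0])
reference = np.array([1.0001, 2.0, 3.0])

strict = np.allclose(measured, reference)             # default tolerances
loose = np.allclose(measured, reference, rtol=1e-3)   # wider relative tolerance
per_element = np.isclose(measured, reference)         # element-wise version

print(strict)        # False
print(loose)         # True
print(per_element)   # [False  True  True]
```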

31. What Is The Difference Between `np.random.seed()` And `default_rng()`?

np.random.seed() controls legacy global random state.

Modern NumPy code should prefer np.random.default_rng().

python

rng = np.random.default_rng(42)

print(rng.integers(1, 10, size=5))

Interview answer:

default_rng() creates an independent random generator object. It avoids relying on shared global state and is the recommended approach for new code.
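
A small sketch of what "independent generator object" means in practice: two generators built from the same seed produce the same draws, and using one does not disturb the other. The names `rng_a` and `rng_b` are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draws_a = rng_a.integers(1, 10, size=5)
extra = rng_a.integers(1, 10, size=100)   # advancing rng_a only
draws_b = rng_b.integers(1, 10, size=5)   # rng_b is unaffected

print(np.array_equal(draws_a, draws_b))   # True
```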

32. How Do You Generate Random Integers, Uniform Values, And Normal Values?

python

rng = np.random.default_rng(7)

integers = rng.integers(1, 101, size=(2, 3))
uniform_values = rng.uniform(0, 1, size=5)
normal_values = rng.normal(loc=0, scale=1, size=5)

print(integers)
print(uniform_values)
print(normal_values)

Use:

integers for random integer ranges
uniform for continuous values in a range
normal for Gaussian-like data

33. What Is The Difference Between `shuffle()` And `choice()`?

shuffle() rearranges an array in place.

python

rng = np.random.default_rng(10)
arr = np.array([1, 2, 3, 4, 5])

rng.shuffle(arr)

print(arr)

choice() samples values.

python

rng = np.random.default_rng(10)
arr = np.array([1, 2, 3, 4, 5])

print(rng.choice(arr, size=3, replace=False))

Use replace=False when the same item should not be selected twice.

34. What Is `np.meshgrid()`?

meshgrid() creates coordinate grids from coordinate vectors.

python

x = np.array([0, 1, 2])
y = np.array([10, 20])

xx, yy = np.meshgrid(x, y)

print(xx)
print(yy)

Output:

text

[[0 1 2]
 [0 1 2]]
[[10 10 10]
 [20 20 20]]

Interview answer:

meshgrid() is useful for evaluating a function on a 2D grid, plotting surfaces, creating coordinate maps, or generating image-style coordinate arrays.

35. What Are Structured Arrays?

Structured arrays let each element contain named fields.

python

students = np.array(
    [
        ("Asha", 92, 8.7, True),
        ("Ravi", 78, 7.9, False),
    ],
    dtype=[
        ("name", "U20"),
        ("score", "i4"),
        ("cgpa", "f4"),
        ("placed", "?"),
    ],
)

print(students["name"])
print(students["score"])

Output:

text

['Asha' 'Ravi']
[92 78]

Interview answer:

Structured arrays are useful when each record has named fields, but for general tabular analytics Pandas is often more convenient.
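
Named fields also work with boolean masks and sorting. The sketch below repeats the `students` array from above so it runs on its own, then filters and sorts by the `score` field.

```python
import numpy as np

students = np.array(
    [("Asha", 92, 8.7, True), ("Ravi", 78, 7.9, False)],
    dtype=[("name", "U20"), ("score", "i4"), ("cgpa", "f4"), ("placed", "?")],
)

# Filter records using a mask built from one field
high_scorers = students[students["score"] > 80]

# Sort whole records by a named field
by_score = np.sort(students, order="score")

print(high_scorers["name"])   # ['Asha']
print(by_score["name"])       # ['Ravi' 'Asha']
```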

36. How Are Images Represented As NumPy Arrays?

A grayscale image can be a 2D array:

text

(height, width)

A color image is often a 3D array:

text

(height, width, channels)

For RGB images, channels are usually 3.

Common operations:

python

image = np.zeros((100, 200, 3), dtype=np.uint8)

print(image.shape)
print(image.dtype)

Output:

text

(100, 200, 3)
uint8

Examples:

python

flipped_vertical = np.flip(image, axis=0)
flipped_horizontal = np.flip(image, axis=1)
darkened = np.clip(image * 0.7, 0, 255).astype(np.uint8)
negative = 255 - image
cropped = image[20:80, 50:150]

37. What Is The Difference Between `np.save`, `np.load`, And `np.savetxt`?

np.save() stores one array in NumPy's binary .npy format.

python

arr = np.array([1, 2, 3])

np.save("numbers.npy", arr)
loaded = np.load("numbers.npy")

print(loaded)

np.savetxt() stores text data such as CSV-like output.

Binary .npy is usually better for preserving dtype and shape.

Use np.savez() or np.savez_compressed() for multiple arrays.
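
A minimal sketch of saving several arrays in one file with `np.savez`. The file name `bundle.npz` and the keyword names are arbitrary choices for this example, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile
import numpy as np

prices = np.array([100, 150, 200])
scores = np.array([[70, 80], [60, 75]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bundle.npz")
    # Keyword names become the keys inside the .npz archive
    np.savez(path, prices=prices, scores=scores)
    with np.load(path) as bundle:
        names = sorted(bundle.files)
        loaded_prices = bundle["prices"]
        loaded_scores = bundle["scores"]

print(names)                 # ['prices', 'scores']
print(loaded_scores.shape)   # (2, 2)
```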

38. Code Output: Slicing View

Question:

python

arr = np.array([10, 20, 30, 40])
view = arr[1:3]
view[1] = 999

print(arr)

Answer:

text

[ 10  20 999  40]

Explanation:

view shares data with arr. view[1] corresponds to arr[2].

39. Code Output: Fancy Indexing Copy

Question:

python

arr = np.array([10, 20, 30, 40])
selected = arr[[1, 2]]
selected[0] = 999

print(arr)
print(selected)

Answer:

text

[10 20 30 40]
[999  30]

Fancy indexing returned a copy.

40. Code Output: Broadcasting

Question:

python

a = np.array([[1], [2], [3]])
b = np.array([10, 20, 30, 40])

print((a + b).shape)
print(a + b)

Answer:

text

(3, 4)
[[11 21 31 41]
 [12 22 32 42]
 [13 23 33 43]]

Shapes:

text

(3, 1)
(4,)

Broadcast to:

text

(3, 4)

41. Code Output: Axis Reduction

Question:

python

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

print(arr.sum(axis=0))
print(arr.sum(axis=1))

Answer:

text

[5 7 9]
[ 6 15]

42. Code Output: `tile` vs `repeat`

Question:

python

arr = np.array([1, 2, 3])

print(np.tile(arr, 2))
print(np.repeat(arr, 2))

Answer:

text

[1 2 3 1 2 3]
[1 1 2 2 3 3]

43. Code Output: `allclose`

Question:

python

a = np.array([0.1 + 0.2])
b = np.array([0.3])

print(a == b)
print(np.allclose(a, b))

Answer:

text

[False]
True

The exact binary representation of decimal fractions can produce tiny differences.

44. Debugging: Why Does This Broadcasting Fail?

Question:

python

sales = np.zeros((4, 3))
bonus = np.array([1, 2, 3, 4])

sales + bonus

Answer:

This fails because shapes are:

text

(4, 3)
(4,)

Broadcasting compares from the right:

text

3 vs 4

They are not equal, and neither is 1.

Fix by making bonus a column:

python

bonus = bonus.reshape(4, 1)
print((sales + bonus).shape)

Output:

text

(4, 3)

45. Debugging: Why Did My Original Array Change?

Question:

python

data = np.arange(10)
part = data[2:5]
part[:] = -1

print(data)

Answer:

part is a view created by slicing, so it shares memory with data.

Output:

text

[ 0  1 -1 -1 -1  5  6  7  8  9]

Fix:

python

part = data[2:5].copy()

46. Debugging: Why Is `arr == np.nan` Always False?

NaN is not equal to itself.

python

arr = np.array([1.0, np.nan, 3.0])

print(arr == np.nan)

Output:

text

[False False False]

Correct:

python

print(np.isnan(arr))

Output:

text

[False  True False]

47. Debugging: Why Did Integer Division Become Float?

python

arr = np.array([1, 2, 3])

print((arr / 2).dtype)
print(arr // 2)

Output:

text

float64
[0 1 1]

/ performs true division and can produce floats. // performs floor division.

48. Coding Task: Normalize Each Row

Question:

Normalize each row using:

text

(row - row_min) / (row_max - row_min)

Solution:

python

data = np.array([
    [10, 20, 30],
    [2, 4, 8],
    [100, 150, 200],
])

row_min = data.min(axis=1, keepdims=True)
row_max = data.max(axis=1, keepdims=True)

normalized = (data - row_min) / (row_max - row_min)

print(normalized)

Output:

text

[[0.         0.5        1.        ]
 [0.         0.33333333 1.        ]
 [0.         0.5        1.        ]]
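
A likely follow-up question is what happens when a row is constant, because `row_max - row_min` is then zero. One possible guard, sketched below under the assumption that a constant row should map to all zeros, replaces a zero range with 1 before dividing.

```python
import numpy as np

data = np.array([
    [10, 20, 30],
    [5, 5, 5],      # constant row: max - min is 0
], dtype=float)

row_min = data.min(axis=1, keepdims=True)
row_range = data.max(axis=1, keepdims=True) - row_min

# Assumption for this sketch: a constant row becomes all zeros.
safe_range = np.where(row_range == 0, 1, row_range)
normalized = (data - row_min) / safe_range

print(normalized)
```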

49. Coding Task: Find Rows With Any Value Greater Than X

python

arr = np.array([
    [1, 2, 3],
    [10, 2, 1],
    [3, 9, 4],
])

x = 6

rows = np.where((arr > x).any(axis=1))[0]

print(rows)

Output:

text

[1 2]

50. Coding Task: Remove Minimum And Maximum Values

Remove every occurrence of the minimum and maximum values.

python

arr = np.array([4, 9, 1, 3, 9, 2, 1, 7])

minimum = arr.min()
maximum = arr.max()

result = arr[(arr != minimum) & (arr != maximum)]

print(result)

Output:

text

[4 3 2 7]

51. Coding Task: Sort Rows By Second Column

python

data = np.array([
    [101, 75],
    [102, 92],
    [103, 60],
])

sorted_rows = data[np.argsort(data[:, 1])]

print(sorted_rows)

Output:

text

[[103  60]
 [101  75]
 [102  92]]

Descending:

python

sorted_rows_desc = data[np.argsort(data[:, 1])[::-1]]

52. Coding Task: Add Total Column And Get Top 2

python

marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
    [95, 91, 93],
    [50, 65, 70],
])

total = marks.sum(axis=1, keepdims=True)
with_total = np.concatenate((marks, total), axis=1)

ranked = with_total[np.argsort(with_total[:, -1])[::-1]]

print(ranked[:2])

Output:

text

[[ 95  91  93 279]
 [ 70  80  90 240]]

53. Coding Task: Unique Rows

python

records = np.array([
    [1, 10],
    [2, 20],
    [1, 10],
    [3, 30],
])

print(np.unique(records, axis=0))

Output:

text

[[ 1 10]
 [ 2 20]
 [ 3 30]]

54. Coding Task: Count Category Frequencies

python

labels = np.array(["free", "pro", "free", "team", "pro", "free"])

categories, counts = np.unique(labels, return_counts=True)

print(categories)
print(counts)

Output:

text

['free' 'pro' 'team']
[3 2 1]

55. Coding Task: Build A Distance Matrix

Given points on a line:

python

points = np.array([1, 4, 9])

Create pairwise absolute distances.

python

distance = np.abs(points[:, np.newaxis] - points[np.newaxis, :])

print(distance)

Output:

text

[[0 3 8]
 [3 0 5]
 [8 5 0]]

This uses broadcasting.

56. Coding Task: Euclidean Distance From A Target Point

python

points = np.array([
    [2, 3],
    [5, 7],
    [1, 8],
])

target = np.array([3, 4])

distances = np.sqrt(((points - target) ** 2).sum(axis=1))

print(distances)

Output:

text

[1.41421356 3.60555128 4.47213595]

57. Coding Task: Create A Checkerboard Matrix

python

board = np.zeros((6, 6), dtype=int)
board[::2, ::2] = 1
board[1::2, 1::2] = 1

print(board)

Output:

text

[[1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]]

58. Coding Task: Replace Outliers With Boundary Values

python

values = np.array([5, 12, 40, 99, 120, -3])

cleaned = np.clip(values, 0, 100)

print(cleaned)

Output:

text

[  5  12  40  99 100   0]

59. Coding Task: Find Common Product IDs

python

batch_a = np.array([101, 102, 103, 104])
batch_b = np.array([103, 104, 105, 106])

print(np.intersect1d(batch_a, batch_b))
print(np.setdiff1d(batch_a, batch_b))
print(np.union1d(batch_a, batch_b))

Output:

text

[103 104]
[101 102]
[101 102 103 104 105 106]

60. Coding Task: Use `meshgrid` To Evaluate A Function

python

x = np.array([0, 1, 2])
y = np.array([10, 20])

xx, yy = np.meshgrid(x, y)

z = xx + yy

print(z)

Output:

text

[[10 11 12]
 [20 21 22]]

61. Interview Answer: How Would You Improve Slow NumPy Code?

Strong answer:

First, I would check whether the code is using Python loops over array elements. Then I would look for vectorization, broadcasting, ufuncs, axis-based reductions, and boolean masks. I would also avoid repeated appends inside loops because NumPy arrays are fixed-size; it is better to collect data first or preallocate the final array. Finally, I would check unnecessary copies, dtype choices, and memory layout if performance still matters.
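
The point about repeated appends can be sketched with three ways of building the same array of squares. `np.append` copies the whole array on every call, so the first pattern is the one to avoid; the variable names are illustrative.

```python
import numpy as np

n = 1_000

# Pattern to avoid: every np.append call allocates a new, larger array.
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i * i)

# Better: allocate once, then fill in place.
filled = np.empty(n, dtype=np.int64)
for i in range(n):
    filled[i] = i * i

# Best when possible: no Python loop at all.
vectorized = np.arange(n, dtype=np.int64) ** 2

print(np.array_equal(grown, filled))        # True
print(np.array_equal(filled, vectorized))   # True
```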

62. Interview Answer: When Should You Not Use NumPy?

Strong answer:

NumPy is not ideal for mixed object-heavy data, heavily nested Python objects, row-by-row business logic, or datasets too large for memory unless paired with chunking or other tools. For labeled tabular data, Pandas is often more ergonomic. For GPU tensor work, PyTorch, TensorFlow, JAX, or CuPy may be better depending on the project.

63. Interview Answer: Why Can Broadcasting Be Dangerous?

Broadcasting can silently create a result with a valid but unintended shape.

Example:

python

a = np.ones((3, 1))
b = np.ones((1, 4))

print((a + b).shape)

Output:

text

(3, 4)

This is correct mathematically, but if you expected a 1D result, it is a bug.

Good habit:

python

print(a.shape, b.shape)

before combining arrays.
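
A common real-world version of this bug, sketched with made-up values, is subtracting predictions of shape `(n, 1)` from targets of shape `(n,)`. The result is an `(n, n)` matrix rather than `n` errors.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])           # shape (3,)
y_pred = np.array([[1.5], [2.0], [2.5]])     # shape (3, 1)

errors = y_true - y_pred            # broadcasts to (3, 3)
fixed = y_true - y_pred.ravel()     # both (3,), element-wise

print(errors.shape)   # (3, 3)
print(fixed)          # [-0.5  0.   0.5]
```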

64. Interview Answer: Why Can Copies Hurt Performance?

Copies use extra memory and time.

If you slice a huge array and can work with a view safely, it can be faster and more memory-efficient.

But views can cause accidental mutation.

Strong answer:

Views are efficient but share data. Copies are safer when independence matters. The right choice depends on whether the downstream code should be allowed to affect the original data.

65. Interview Answer: Why Does dtype Matter?

dtype controls:

memory usage
numerical range
precision
operation results
compatibility with libraries

Example:

python

a = np.array([1, 2, 3], dtype=np.int8)
b = np.array([1, 2, 3], dtype=np.float64)

print(a.itemsize)
print(b.itemsize)

Output:

text

1
8

Using a smaller dtype can save memory, but it can also overflow if values exceed the dtype range.
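
The overflow risk can be sketched with int8, whose range is -128 to 127. Array arithmetic wraps around silently instead of raising an error. The values below are made up for illustration.

```python
import numpy as np

a = np.array([100, 120], dtype=np.int8)
b = np.array([100, 10], dtype=np.int8)

limits = np.iinfo(np.int8)
print(limits.min, limits.max)   # -128 127

# 200 and 130 do not fit in int8, so the results wrap around.
wrapped = a + b
print(wrapped.tolist())         # [-56, -126]

# Widening one operand first gives the expected sums.
safe = a.astype(np.int16) + b
print(safe.tolist())            # [200, 130]
```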

66. Quick Revision Table

Topic	Interview point
`ndarray`	typed, multidimensional array
`shape`	size of each dimension
`dtype`	type and storage format of elements
`strides`	bytes to move along each axis
view	shares data
copy	owns separate data
basic slicing	usually view
advanced indexing	usually copy
broadcasting	compatible shape expansion without manual loops
`axis`	dimension being reduced or operated along
`keepdims`	keeps reduced axes for broadcasting
`ravel`	view when possible
`flatten`	copy
`default_rng`	recommended random generator constructor
`allclose`	tolerance-based float comparison
`tile`	repeats whole pattern
`repeat`	repeats individual elements
`meshgrid`	coordinate grids
structured array	records with named fields

Answer 1

NumPy arrays are faster because values are stored in a compact typed buffer, and operations run in optimized compiled code instead of Python-level loops.

Solution Key

Answer 2

Shapes:

text

(5, 1)
(3,)

Result:

text

(5, 3)

Answer 3

Basic slices usually create views, so modifying the slice can modify the original array.

Answer 4

python

arr = np.array([
    [1, 2, 3],
    [4, -1, 6],
    [7, 8, 9],
])

rows = arr[(arr < 0).any(axis=1)]

print(rows)

Explanation

A 2D NumPy array arr is created with integers, including a negative value (-1).
The expression (arr < 0).any(axis=1) generates a boolean array indicating which rows contain at least one negative value.
The original array arr is indexed with this boolean array to extract the rows that meet the condition.
The resulting rows are stored in the variable rows and printed, showing only the rows with negative values.

Answer 5

python

data = np.array([
    [10, 100],
    [20, 150],
    [30, 200],
])

col_min = data.min(axis=0, keepdims=True)
col_max = data.max(axis=0, keepdims=True)

normalized = (data - col_min) / (col_max - col_min)

print(normalized)

Explanation

The code initializes a 2D NumPy array named data with specific values.
It calculates the minimum values for each column using data.min(axis=0, keepdims=True), preserving the array's dimensions.
Similarly, it computes the maximum values for each column with data.max(axis=0, keepdims=True).
The normalization formula (data - col_min) / (col_max - col_min) is applied to scale the data to a range between 0 and 1.
Finally, the normalized array is printed, showing the transformed values.

Answer 6

python

arr = np.array([12, 99, 4, 42, 18, 77])

top_3 = np.sort(arr)[-3:][::-1]

print(top_3)

Explanation

The code initializes a NumPy array arr containing six integer values.
It sorts the array in ascending order using np.sort(arr).
The last three elements of the sorted array, which are the highest values, are selected with [-3:].
The selected values are then reversed to present them in descending order using [::-1].
Finally, the top three values are printed to the console.

Answer 7

python

arr = np.array([1, 2, 2, 3, 4, 4, 4])

values, counts = np.unique(arr, return_counts=True)
duplicates = values[counts > 1]

print(duplicates)

Explanation

The code initializes a NumPy array arr containing integers, some of which are duplicated.
It uses np.unique() to find unique values in the array while also counting their occurrences, returning two arrays: values and counts.
The duplicates array is created by filtering values where the corresponding counts are greater than 1, indicating duplicates.
Finally, it prints the duplicates array, which contains the values that appear more than once in the original array.

Answer 8

python

arr = np.array([1.0, np.nan, 3.0, np.nan])

cleaned = np.where(np.isnan(arr), 0, arr)

print(cleaned)

Explanation

The code initializes a NumPy array arr containing floating-point numbers, including NaN values.
It uses np.isnan(arr) to create a boolean mask identifying the NaN elements in the array.
The np.where function replaces NaN values with 0 while keeping other values unchanged.
The resulting array cleaned is printed, showing the original values with NaN replaced by 0.

Alternative:

python

cleaned = np.nan_to_num(arr, nan=0.0)

Explanation

Utilizes the np.nan_to_num() function from the NumPy library to handle NaN values.
The input arr is a NumPy array that may contain NaN (Not a Number) entries.
Any NaN values found in arr are replaced with 0.0, ensuring the output array cleaned has no NaN values.
This is useful for data preprocessing, especially before performing mathematical operations or analyses that cannot handle NaN values.

Answer 9

python

identity = np.eye(5)

print(identity)

Explanation

np.eye(5) creates a 5x5 identity matrix: ones on the main diagonal and zeros everywhere else.
The default dtype is float64, so the entries print as 1. and 0.; pass dtype=int to get integers.
The print(identity) statement outputs the matrix to the console.
Identity matrices are the neutral element of matrix multiplication, which makes them useful in linear algebra and transformations.
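
As a quick check of the defining property, the sketch below multiplies a small matrix by the identity and also shows the k offset of np.eye. The 3x3 matrix is made-up sample data.

```python
import numpy as np

identity = np.eye(3)
matrix = np.array([[2.0, 1.0, 0.0],
                   [0.0, 3.0, 4.0],
                   [5.0, 0.0, 6.0]])

# Multiplying by the identity matrix leaves a matrix unchanged.
unchanged = identity @ matrix

print(identity.dtype)                     # float64
print(np.array_equal(unchanged, matrix))  # True

# k shifts the diagonal: k=1 places the ones just above the main diagonal.
upper = np.eye(3, k=1, dtype=int)
print(upper)
```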

Answer 10

python

import numpy as np

arr = np.array([[1, 2], [3, 4]])

np.save("arr.npy", arr)
loaded = np.load("arr.npy")

print(loaded)

Explanation

The code creates a 2D NumPy array named arr containing the values [[1, 2], [3, 4]].
np.save writes the array to a file called "arr.npy" in NumPy's binary .npy format, which preserves the shape and dtype.
The array is then loaded back into memory using np.load, retrieving the saved data into the variable loaded.
Finally, the loaded array is printed to the console, displaying its contents.
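
Interviewers sometimes extend this task to several arrays. One possible approach is np.savez, sketched below with a temporary directory so that nothing is left on disk; the file name data.npz and the keys scores and labels are hypothetical.

```python
import os
import tempfile

import numpy as np

scores = np.array([[1, 2], [3, 4]])
labels = np.array(["a", "b"])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.npz")

    # np.savez stores several named arrays in one uncompressed .npz archive.
    np.savez(path, scores=scores, labels=labels)

    # np.load on an .npz file returns a dict-like object keyed by those names.
    with np.load(path) as data:
        loaded_scores = data["scores"]
        loaded_labels = data["labels"]

print(np.array_equal(loaded_scores, scores))  # True
print(loaded_scores.dtype == scores.dtype)    # True
```

np.savez_compressed takes the same arguments and trades some write time for a smaller file.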

70. Common Mistakes To Avoid

Mistake 1: Not checking shapes

Most NumPy bugs are shape bugs.

Always inspect:

python

print(arr.shape)

Explanation

arr.shape returns a tuple with the length of each dimension, for example (3, 4) for a 2D array with 3 rows and 4 columns.
Printing it before an operation shows whether the operands line up or will broadcast the way you expect.
The output depends on the array being inspected.
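
A typical shape bug raises no error at all. The sketch below, using made-up arrays, shows how adding a (3,) array to a (3, 1) array broadcasts to a 3x3 result instead of the element-wise sum that was probably intended.

```python
import numpy as np

row = np.array([1, 2, 3])     # shape (3,)
column = row.reshape(3, 1)    # shape (3, 1)

# Intended: element-wise sum of two length-3 vectors.
# Actual: broadcasting (3,) against (3, 1) silently builds a 3x3 table.
accidental = row + column
intended = row + column.ravel()

print(accidental.shape)  # (3, 3)
print(intended.shape)    # (3,)
```

Checking both shapes before the addition would have exposed the mismatch immediately.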

Mistake 2: Confusing views and copies

If you modify a slice and the original changes, you probably had a view.

Use:

python

arr[2:5].copy()

Explanation

The slice arr[2:5] selects the elements at indices 2, 3, and 4 (the stop index 5 is exclusive).
On a NumPy array, this basic slice is a view that shares memory with arr, not a new array.
Calling .copy() allocates a new array with its own data.
Changes to the copy then leave the original array untouched.

when independence matters.
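
The difference is easy to verify. The following sketch, with illustrative variable names, modifies a view and a copy of the same slice and inspects the original afterwards.

```python
import numpy as np

arr = np.arange(10)

view = arr[2:5]                 # basic slice: shares memory with arr
independent = arr[2:5].copy()   # separate buffer

view[0] = 100                   # also changes arr[2]
independent[1] = -1             # arr is unaffected

print(arr[2], arr[3])                      # 100 3
print(view.base is arr)                    # True
print(np.shares_memory(arr, independent))  # False
```

Checking .base or np.shares_memory is a quick way to answer "view or copy?" during debugging.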

Mistake 3: Comparing floats with exact equality

Use:

python

np.allclose(a, b)

Explanation

np.allclose(a, b) compares two arrays element-wise and returns a single boolean.
It returns True if every pair satisfies |a - b| <= atol + rtol * |b|, and False otherwise.
The defaults are rtol=1e-05 and atol=1e-08; both can be overridden with keyword arguments.
NaN values compare as unequal unless equal_nan=True is passed.

when tiny numerical differences are acceptable.
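
The classic illustration is 0.1 + 0.2, which is not exactly 0.3 in binary floating point. The sketch below contrasts exact comparison with np.allclose; the arrays are made-up sample data.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

print(np.array_equal(a, b))  # False: 0.1 + 0.2 is not exactly 0.3
print(np.allclose(a, b))     # True with the default tolerances

# np.isclose gives the element-wise result; NaN matches NaN only with equal_nan=True.
with_nan = np.array([np.nan, 1.0])
elementwise = np.isclose(with_nan, with_nan)
print(elementwise.tolist())                             # [False, True]
print(np.allclose(with_nan, with_nan, equal_nan=True))  # True
```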

Mistake 4: Using loops for simple array operations

Prefer:

python

arr * 2
arr[arr > 0]
arr.sum(axis=1)

Explanation

The expression arr * 2 scales each element of the array arr by a factor of 2, effectively doubling its values.
The expression arr[arr > 0] filters the array to include only the elements that are greater than zero, creating a new array with positive values.
The method arr.sum(axis=1) collapses axis 1 (the columns) of a 2D array, returning a new array with one sum per row.

over manual loops when possible.
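
To make the contrast concrete, the sketch below computes row sums with nested Python loops and then with the vectorized expressions. The 2x3 array is made-up sample data.

```python
import numpy as np

arr = np.array([[1, -2, 3],
                [-4, 5, 6]])

# Loop version: row sums built one Python-level iteration at a time.
loop_sums = []
for row in arr:
    total = 0
    for value in row:
        total += value
    loop_sums.append(int(total))

# Vectorized versions of the three expressions.
doubled = arr * 2
positives = arr[arr > 0]    # 1-D result, in row-major order
row_sums = arr.sum(axis=1)

print(positives)  # [1 3 5 6]
print(row_sums)   # [2 7]
```

Both versions give the same row sums, but the vectorized one runs its loop in compiled code and states the intent in a single line.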

Mistake 5: Using `np.append()` repeatedly in a loop

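NumPy arrays are fixed-size, so np.append() cannot grow an array in place. Each call allocates a new array and copies all existing elements into it, which makes a loop of appends quadratic in the number of elements.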

Better approaches:

collect values in a Python list, then convert once
preallocate the final NumPy array
use concatenate once when possible
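
The three alternatives can be sketched as follows. This is an illustrative comparison, assuming the values are squares of a loop counter; n and all variable names are hypothetical.

```python
import numpy as np

n = 1_000

# Anti-pattern: np.append copies the whole array on every call.
slow = np.array([], dtype=np.int64)
for i in range(n):
    slow = np.append(slow, i * i)

# Option 1: collect values in a Python list, then convert once.
collected = []
for i in range(n):
    collected.append(i * i)
from_list = np.array(collected, dtype=np.int64)

# Option 2: preallocate the final array, then fill it by index.
preallocated = np.empty(n, dtype=np.int64)
for i in range(n):
    preallocated[i] = i * i

# Option 3: gather existing chunks and concatenate them in one call.
chunks = [from_list[:500], from_list[500:]]
joined = np.concatenate(chunks)

print(np.array_equal(slow, from_list))          # True
print(np.array_equal(from_list, preallocated))  # True
print(np.array_equal(from_list, joined))        # True
```

When the values follow a formula like this one, a fully vectorized expression such as np.arange(n) ** 2 avoids the loop altogether.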

Final Summary

For NumPy interviews, remember these core ideas:

ndarray is a typed multidimensional array.
Shape tells you structure; dtype tells you storage and numerical behavior.
Strides explain how NumPy walks through memory.
Basic slicing returns a view.
Advanced (boolean or integer-array) indexing returns a copy.
Broadcasting compares shapes from right to left.
Vectorization avoids Python-level loops.
axis tells NumPy which dimension to operate over or reduce.
keepdims=True keeps each reduced axis as a size-1 dimension, so the result broadcasts against the original array.
default_rng() is preferred for modern random number generation.
allclose() is better than exact equality for many floating-point checks.
Image data, structured records, and grids are all natural NumPy use cases.

The best interview answers are short, accurate, and supported by a small example.

Sources and Further Reading

NumPy documentation: https://numpy.org/doc/
NumPy ndarray reference: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
NumPy N-dimensional array (ndarray) reference: https://numpy.org/doc/stable/reference/arrays.ndarray.html
NumPy copies and views: https://numpy.org/doc/stable/user/basics.copies.html
NumPy broadcasting guide: https://numpy.org/doc/stable/user/basics.broadcasting.html
NumPy random Generator: https://numpy.org/doc/stable/reference/random/generator.html
NumPy strides reference: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.strides.html
NumPy allclose reference: https://numpy.org/doc/stable/reference/generated/numpy.allclose.html
NumPy structured arrays: https://numpy.org/doc/stable/user/basics.rec.html
NumPy I/O routines: https://numpy.org/doc/stable/reference/routines.io.html
